Razor & Pyzor & DCC - snowtty - ChinaUnix Blog

Razor&Pyzor &DCC

2006-02-18 09:19:00

Network Tests  Free Chapter

SpamAssassin on its own can detect a high proportion of spam. By using network tests, spam detection can be further improved. SpamAssassin includes support for Realtime BlockLists (RBLs) and Spam URI Realtime BlockLists (SURBLs). All these external services are easy to integrate into SpamAssassin.

The effectiveness of network tests varies from a 60% detection rate upwards. By using them in conjunction with SpamAssassin, spam detection rates are much higher, typically over 95%! However, network tests slow down spam detection. This means that the SpamAssassin processes will take longer to complete and will increase the memory usage of the email server.

This chapter describes the support SpamAssassin has for RBLs and SURBLs, and focuses on three external services:

Vipul's Razor

Pyzor

The Distributed Checksum Clearinghouse (DCC)

RBLs are blocklists of known sources of spam. By default, SpamAssassin uses a number of RBLs to check the source of the email.

A SURBL is a blocklist of Universal Resource Identifiers (URIs) that appear in spam email. They filter spam by having a list of websites that have been advertised in spam emails. SpamAssassin includes support for SURBLs in version 3.0, and a plug-in is available for version 2.63.

Razor, Pyzor, and DCC operate by comparing incoming emails with known spam. They allow clients to query their database to determine if an email is likely to be spam. However, there is a difference in operation: the Razor database contains only spam emails, whereas Pyzor and the DCC have a database of all emails that have been tested, and keep a count of how often they have been submitted for testing. Bulk emails are indicated by a high number of reports. In other words, Razor is a spam detecting network, and Pyzor and the DCC are bulk email detecting networks.

Razor is currently in version 2, known as Razor2. Within this chapter, Razor2 will be referred to as Razor to aid readability. Razor uses a distributed network of many servers, and only spam is reported to Razor. It is highly reliable; there are rarely false positives, and it recognizes around 25% of spam.

Pyzor uses a single server and tracks all emails, not just spam emails. Spam is detected by a high number of reports rather than being explicitly identified as spam.

The Distributed Checksum Clearinghouse, as its name implies, uses a distributed approach. Its mode of operation is that all emails are reported to it, and counted. Bulk emails will have high counts and can thus be recognized as spam. At the time of writing, there are approximately 200 machines in the DCC network. The servers exchange spam details with each other, to react quickly to new spam.
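
The counting approach used by bulk-detecting networks can be illustrated with a toy model. The sketch below is not how Pyzor or DCC is implemented; the `BulkCounter` class, its `threshold` parameter, and the checksum strings are all hypothetical names chosen for illustration.

```python
from collections import Counter


class BulkCounter:
    """Toy stand-in for a bulk-detection server: counts reports per checksum."""

    def __init__(self, threshold):
        self.threshold = threshold  # hypothetical cut-off, not a real default
        self.counts = Counter()

    def report(self, checksum):
        """Record one sighting of a checksum and return the updated count."""
        self.counts[checksum] += 1
        return self.counts[checksum]

    def is_bulk(self, checksum):
        """Treat a message as bulk once its count reaches the threshold."""
        return self.counts[checksum] >= self.threshold


server = BulkCounter(threshold=3)
for _ in range(3):
    server.report("aaa111")      # the same mailing seen three times
server.report("bbb222")          # a one-off personal message

print(server.is_bulk("aaa111"))  # True
print(server.is_bulk("bbb222"))  # False
```

A server of this kind never needs to know whether a message is spam; it only needs to know how often the same content has been seen.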

All three services are free. However, if an organization uses distributed services extensively, it can set up a server for its own use and support the service by making it available for public use.

If Razor is used, it will assist others only if spam is reported to Razor. SpamAssassin tags should not be relied upon to identify spam; a human must identify the email as spam in case there is an error. Emails addressed to a spamtrap address can be reported automatically. Spamtraps are discussed later in the chapter. Note that the Razor network must not contain incorrect data, or its effectiveness drops.

Razor, Pyzor, and DCC rely on checksums. A checksum is a small number or code made from a larger number or message. It is similar to a check digit in a credit-card number or airline ticket number. The checksums are calculated by a client application and transmitted to the server, which compares them with checksums of other emails. As checksums are small, network traffic is minimal, and so is the processing required to perform the comparison against the database of known spam. For example, DCC typically transmits 100 bytes (less than two lines of text) when querying an email message. This is a fraction of the size of an email message; the headers alone on an email will be several times larger.
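
To give a feel for how small a checksum is compared with the message it represents, here is a minimal sketch using SHA-1 from Python's standard library. The choice of SHA-1 and the `message_checksum` helper are assumptions for illustration only; Razor, Pyzor, and DCC each use their own checksum schemes.

```python
import hashlib


def message_checksum(message: bytes) -> str:
    """Return a fixed-size hexadecimal digest of an arbitrary-size message."""
    return hashlib.sha1(message).hexdigest()


# A made-up message of several kilobytes
message = b"Subject: Cheap watches\r\n\r\n" + b"Buy now! " * 500
digest = message_checksum(message)

print(len(message), "bytes in the message")
print(len(digest), "characters in the checksum")
```

Whatever the size of the input, the digest stays the same length, which is why querying a server costs so little bandwidth.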

Pyzor and DCC benefit from every report of email, spam or ham. Only checksums (and not the whole email) are communicated to the server, so there is no disclosure of confidential data. As checksums change even with a slight change in the message, some parts of the message (for example, those that contain dates and times) are excluded from the checksum calculations. A small network overhead is involved with reporting email. The integration of SpamAssassin and DCC described below will do this automatically, and the Pyzor package also has this ability.
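
The idea of leaving volatile parts out of the checksum can be sketched as follows. This is an illustration only: the list of headers to skip, the whitespace handling, and the `normalized_checksum` helper are assumptions, and the real clients apply their own (more sophisticated) normalization.

```python
import hashlib
import re

# Headers assumed, for this sketch only, to differ between copies of one mailing.
VOLATILE_HEADERS = ("date:", "received:", "message-id:")


def normalized_checksum(raw_message: str) -> str:
    """Checksum a message after dropping volatile headers and extra whitespace.

    Simplification: folded (multi-line) headers are not handled.
    """
    headers, _, body = raw_message.partition("\n\n")
    kept = [
        line for line in headers.split("\n")
        if not line.lower().startswith(VOLATILE_HEADERS)
    ]
    body = re.sub(r"\s+", " ", body).strip()
    canonical = "\n".join(kept) + "\n\n" + body
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


copy_a = "Date: Mon, 10 May 2004 11:36:01 +0000\nSubject: Offer\n\nBuy   now!\n"
copy_b = "Date: Tue, 11 May 2004 09:00:00 +0000\nSubject: Offer\n\nBuy now!\n"
other = "Date: Tue, 11 May 2004 09:00:00 +0000\nSubject: Offer\n\nLunch today?\n"

print(normalized_checksum(copy_a) == normalized_checksum(copy_b))  # True
print(normalized_checksum(copy_a) == normalized_checksum(other))   # False
```

Two copies of the same mailing sent on different days produce the same checksum, while a different message does not.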

In terms of effectiveness, DCC is generally considered better than the others. However, all these services can be used with SpamAssassin at the same time, the cost being a delay of one or two seconds per email while incoming messages are being processed. Keep in mind that if the servers used are unavailable, email processing will take longer.

A number of RBLs are enabled with the default configuration of SpamAssassin. These are defined in /usr/share/spamassassin/20_dnsbl_tests.cf. An example definition is shown here:

    header RCVD_IN_NJABL            eval:check_rbl('njabl', 'dnsbl.njabl.org.')
    describe RCVD_IN_NJABL          Received via a relay in dnsbl.njabl.org
    tflags RCVD_IN_NJABL            net

One set of definitions appears for each RBL configured. Rule definitions are explained in more detail in Chapter 12.
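
Behind a rule like this, a DNS-based blocklist is conventionally queried by reversing the octets of the relay's IP address and appending the blocklist zone; a listed address resolves, an unlisted one does not. The sketch below only builds the query name and performs no lookup. The `dnsbl_query_name` helper and the example address (taken from a documentation-only range) are illustrative assumptions, not part of SpamAssassin.

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name conventionally used to ask a blocklist about an IPv4 address."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")  # also validates the address
    return ".".join(reversed(octets)) + "." + zone.strip(".")


print(dnsbl_query_name("192.0.2.99", "dnsbl.njabl.org."))
# 99.2.0.192.dnsbl.njabl.org
```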

All the rules include a line that sets tflags to net. This groups the rules as network tests, and allows SpamAssassin to treat them as a group. There are two main reasons for this. The first is that network tests may take a long time to complete, especially at busy times. SpamAssassin uses a timeout for network tests, but it also applies this timeout in a progressive manner. If most of the network tests have completed, SpamAssassin will not wait for the last tests to complete. Specific details are given in the Mail::SpamAssassin::Conf man page under the rbl_timeout entry. The second reason for grouping all the tests together is so they can be switched on or off with a single configuration directive.

To disable RBL tests, set skip_rbl_checks to 1 in /etc/mail/spamassassin/local.cf for a site-wide change, or ~/.spamassassin/user_prefs for a change that influences only one user. To enable RBL tests, set skip_rbl_checks to 0.

    skip_rbl_checks  0
    rbl_timeout      15

The timeout for network tests is set with the rbl_timeout configuration directive, also placed in local.cf or user_prefs. This specifies the timeout in seconds.
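
When auditing a server, it can be handy to check which of these directives a configuration file actually sets. The following is a rough sketch, not a full parser for SpamAssassin's configuration syntax: the `read_directives` helper is hypothetical, handles only simple "name value" lines, and treats RBL checks as enabled when skip_rbl_checks is absent.

```python
def read_directives(config_text: str) -> dict:
    """Collect simple 'name value' directives, ignoring comments and blank lines."""
    settings = {}
    for line in config_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        parts = line.split(None, 1)
        if len(parts) == 2:
            settings[parts[0]] = parts[1].strip()
    return settings


local_cf = """
# network test settings
skip_rbl_checks  0
rbl_timeout      15
"""

settings = read_directives(local_cf)
rbl_enabled = settings.get("skip_rbl_checks", "0") == "0"

print(settings)     # {'skip_rbl_checks': '0', 'rbl_timeout': '15'}
print(rbl_enabled)  # True
```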

New RBLs that support the same interface as existing ones can be added by adding new rules similar to the example above. The definitions can be added to the site-wide configuration by adding them to any file matching *.cf in the directory /etc/mail/spamassassin/. The rules can be added on a per-user basis by adding them to ~/.spamassassin/user_prefs.

As described in Chapter 5, each RBL has its own criteria for listing an address in a blacklist. To disable testing a particular RBL, set the score for the appropriate rule to 0 in /etc/mail/spamassassin/local.cf for a site-wide change, or in ~/.spamassassin/user_prefs for a change that influences only one user. In the example above, the rule could be disabled as follows:

    score RCVD_IN_NJABL 0

Spam URI Realtime BlockLists are a relatively recent technique and SpamAssassin 3.0 supports a relatively small number of SURBLs. SURBLs are configured much like RBLs. SpamAssassin 2.63 can use a different plug-in, described later. Details on SURBLs can be found at .

The SURBLs are defined in /usr/share/spamassassin/25_uribl.cf. An example definition is shown below:

    uridnsbl        URIBL_SBL       sbl.spamhaus.org.       TXT
    header          URIBL_SBL       eval:check_uridnsbl('URIBL_SBL')
    describe        URIBL_SBL       Contains a URL listed in the SBL blocklist
    tflags          URIBL_SBL       net

One set of definitions appears for each SURBL configured.

As with RBLs, the SURBL rules set the tflags to net, to enable timeouts to be used, and to enable the rules to be switched on and off together.

SURBLs are implemented as a SpamAssassin plug-in. Plug-ins allow SpamAssassin to be extended with new types of tests and rules without changing SpamAssassin itself. To be enabled, the plug-in must be loaded. On SpamAssassin version 3.0, this is loaded by default. To confirm that the plug-in is loaded, examine /etc/mail/spamassassin/init.pre for a line similar to the following:

    loadplugin Mail::SpamAssassin::Plugin::URIDNSBL

If this line is not present, it can be added to /etc/mail/spamassassin/init.pre. If SpamAssassin is used as a daemon, then spamd will have to be restarted after this change.

Detailed configuration information on SURBLs is available in the Mail::SpamAssassin::Plugin::URIDNSBL man page.

To disable SURBL tests, comment out the loadplugin entry in the init.pre file. This will prevent the module from being loaded.

The timeout for SURBL tests can be altered from the default of two seconds by changing uridnsbl_timeout in /etc/mail/spamassassin/local.cf for a site-wide change, or ~/.spamassassin/user_prefs for a change that affects only one user:

    uridnsbl_timeout     15

New SURBLs that support the same interface as existing ones can be added by adding new rules similar to the example above. The definitions can be added to the site-wide configuration by adding them to any file matching *.cf in the /etc/mail/spamassassin/ directory. The rules can be added on a per-user basis by adding them to ~/.spamassassin/user_prefs.

As with RBLs, some SURBLs may be more reliable or aggressive than others. To disable testing a particular SURBL, set the score for the appropriate rule to 0 in /etc/mail/spamassassin/local.cf for a site-wide change, or ~/.spamassassin/user_prefs for a change that affects only one user. In the example above, the rule would be disabled as follows:

    score URIBL_SBL  0

If SpamAssassin 2.63 is in use, the SpamCopUri plug-in can be used. This is available from .  This uses a single source of URIs from spamcop.net. Consult the package documentation for more details.

Perl is required to install and use Vipul's Razor. This will already be installed as SpamAssassin uses it. A C compiler is also required, except on Debian Linux, for which a binary package is available.

To operate, Razor requires a constant internet connection. The Razor communication uses TCP port 2703, and Razor also uses TCP pings on port 7 to determine which servers are closest, so firewalls will have to be configured to enable these ports.
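
Before installing, it may be worth confirming that the firewall allows outgoing TCP connections on the Razor port. A minimal sketch of such a check is shown below; the `tcp_port_reachable` helper is hypothetical and the host name in the usage comment is a placeholder, not a real Razor server.

```python
import socket


def tcp_port_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Hypothetical usage; substitute a server name known to your installation:
# print(tcp_port_reachable("razor.example.net", 2703))
```

A result of False does not say which firewall is blocking the connection, only that the connection could not be made from this machine.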

There are no mainstream RPM packages available for Razor. However, Razor is available in Gentoo and Debian Linux. To install in Gentoo, use the emerge razor command, and to install in Debian, use apt-get install razor. On other Linux distributions and UNIX variants it can be installed from source.

Razor is available for download from the home page at . Razor is not available via CPAN. Two packages are available, razor-agents and razor-agents-sdk. Both packages should be downloaded. The razor-agents-sdk package contains Perl modules that the Razor package depends on, and should be installed before Razor.

Change to a suitable directory for building the software, and unpack the tarball. Then, make the software from the source directory:

    $ cd /some/place
    $ gunzip -c /path/to/razor-agents-sdk-m.nn.tar.gz | tar xf - 
    $ cd razor-agents-sdk-m.nn
    $ perl Makefile.PL
    $ make
    $ make test
    $ su 
    # make install

Administrative privileges are required for the make install command.

Repeat these steps for the razor-agents tarball:

    $ cd /some/place
    $ gunzip -c /path/to/razor-agents-m.nn.tar.gz | tar xf - 
    $ cd razor-agents-m.nn
    $ perl Makefile.PL
    $ make
    $ make test
    $ su
    # make install

After the Razor client has been installed, it must be configured before use. Razor stores its configuration files in ~/.razor. We configure it for the operating system account that will be used to run SpamAssassin. When SpamAssassin invokes Razor, it will be running under the same system account. Configuring Razor is a three-step process.

The first step is to create the configuration directory and some important files within it. To do this, use the razor-admin command with the -create parameter:

    $ razor-admin -create

If you receive an error message, then the razor-admin application has been unable to communicate with any Razor servers. Check that network connectivity is available and Razor ports are enabled through the firewall(s), if any. The following error occurs when the network connection is blocked by a firewall:

    nextserver: discover0: No Razor Discovery servers available at this time

If the razor-admin command worked, the ~/.razor directory will have been created with the following files in it:

    $ ls .razor 
    razor-agent.conf
    server.folly.cloudmark.com.conf
    servers.catalogue.lst
    servers.discovery.lst
    servers.nomination.lst

The second stage is to configure a Razor user. This is not an operating-system user account, but Razor's own user, for the Razor databases. Razor has two classes of users, a default user, who can only enquire if an email is spam or not, and a registered user, who can submit spam to the database in addition to testing emails. Razor tracks the emails that each user submits to the central database. If a Razor user is a repeated source of false reports (that is, reports emails as spam when they are not), then the Razor user will be revoked or disabled. False reports are highlighted when users complain that legitimate emails are identified as spam by Razor.

By default, Razor does not use its default user, and has to be explicitly configured.

Razor must be configured with a defined user before it will test for spam.

Registering users requires a network connection. To register Razor using the default user (this will only allow spam to be checked, and not submitted), use the following command:

    $ razor-admin -register -l

The command should respond with the name of the identity file:

    #Register successful.  Identity stored in /home/spam/.razor/identity-ruhct7uCxF

When Razor registers a user, it creates a file beginning with identity- followed by a seemingly random series of alphanumeric characters. It also creates a link to this file, named simply identity. This link can be manipulated to switch between users without deleting or editing files.

Only a non-default user can submit spam to Razor. Razor can generate both a username and a password, generate just a password for a supplied username, or use a supplied username and password. The password is not required in normal operation, and there is little value in choosing one. The username and password are stored in plaintext in a file in the .razor directory.

The -user {username} and -pass {password} parameters should be passed to the razor-admin command if a particular value is required. The following command will attempt to register the username my_username with the password my_password:

    $ razor-admin -register -user my_username -pass my_password

If the username has been registered already, you may see an error message:

    Error 210: User exists. Try another name. aborting.

If this occurs, choose another username. Usernames are never publicly seen, and choosing a specific username should not be particularly important. If any other error occurs when registering a user, consult the online documentation available on the Razor website.

Identity files created with the razor-admin -register command can be copied from account to account and from machine to machine, as required. Copy the identity-* file created by razor-admin -register to the other machines or directories, and ensure that a link named identity points to the real identity file.

    $ cd .razor
    $ cp /path/to/identity-ruhct7uCxF .
    $ unlink identity
    $ ln -s identity-ruhct7uCxF identity
    $ ls -l identity
    lrwxrwxrwx  1  spam   users       19 May  7 18:58 identity -> identity-ruhct7uCxF
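
Switching between several identity files amounts to repointing the identity link. The sketch below shows one way to script this; the `select_identity` helper and the second file name are hypothetical, and the demonstration runs in a throwaway directory instead of a real ~/.razor.

```python
import os
import tempfile


def select_identity(razor_dir: str, identity_file: str) -> None:
    """Point the 'identity' symbolic link in razor_dir at the given identity file."""
    link = os.path.join(razor_dir, "identity")
    target = os.path.join(razor_dir, identity_file)
    if not os.path.isfile(target):
        raise FileNotFoundError(target)
    if os.path.lexists(link):
        os.remove(link)                 # remove the old link, not its target
    os.symlink(identity_file, link)     # relative target, as with `ln -s`


# Demonstration in a temporary directory with placeholder identity files
with tempfile.TemporaryDirectory() as razor_dir:
    for name in ("identity-ruhct7uCxF", "identity-example2"):
        with open(os.path.join(razor_dir, name), "w") as handle:
            handle.write("placeholder\n")
    select_identity(razor_dir, "identity-ruhct7uCxF")
    select_identity(razor_dir, "identity-example2")
    current = os.readlink(os.path.join(razor_dir, "identity"))

print(current)  # identity-example2
```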

The third and final step in Razor configuration is to tell the client software to discover the nearest Razor server. This is done with the -discover flag of the razor-admin command.

    $ razor-admin -discover

This command will query a known website, retrieve the current servers, and choose the one with the lowest network latency.

The server locations change very infrequently. However, a weekly cron job can be configured to run the discover command again. This might look like:

    # run razor-admin -discover weekly in case servers change
    3 3 * * 1  razor-admin -discover

It is important that this is installed in the crontab of the system account that is used to run SpamAssassin.

In releases of SpamAssassin prior to 2.63, Razor was detected and used automatically. In subsequent releases, Razor has to be explicitly configured. Configuration settings are modified in /etc/mail/spamassassin/local.cf for a site-wide change, or ~/.spamassassin/user_prefs to change the settings for a particular user. Add the following entries:

    # Use Vipul's Razor?
    use_razor2 1

    # path to Vipul's Razor config file
    razor_config /home/user/.razor/razor-agent.conf

Ensure that the path in the razor_config line is correct.

To disable Razor, set use_razor2 to 0 in ~/.spamassassin/user_prefs for an individual user, or in /etc/mail/spamassassin/local.cf for a site-wide change:

    # Use Vipul's Razor?
    use_razor2 0

SpamAssassin can use the following configuration directives, placed in ~/.spamassassin/user_prefs or /etc/mail/spamassassin/local.cf.

Directive	Purpose
use_razor2	Turn Razor on (1) or off (0)
razor_timeout	Time in seconds to wait for a response from Razor. The default is 10. If Razor is unavailable, all emails will be delayed for the specified number of seconds before Razor processing is abandoned and other SpamAssassin tests completed.


The following configuration directive can only be set by administrators and should be placed in /etc/mail/spamassassin/local.cf.

Directive	Purpose
razor_config	Location of Razor configuration files, if different from the default of ~/.razor/. Altering this value would allow several users to share a common Razor configuration.

For the initial installation, just turn Razor on with use_razor2.

Although Razor keeps a log of activity, this feature is not used with SpamAssassin. When SpamAssassin finds a message that Razor identifies as spam, it will fire the RAZOR2_CHECK rule, and the following will be listed in the X-Spam-Report header line of the email:

    X-Spam-Report:
     * 1.0 RAZOR2_CHECK Listed in Razor2 ()

After a period of time has passed and a significant number of spam messages have been received, then statistically Razor should have located at least one message. After around 100 new spam messages, there is a good chance that Razor has detected one or more. Unless the default headers that SpamAssassin uses have been changed, SpamAssassin will report the tests fired in the message header (see Chapter 10 for details of how to change the headers that SpamAssassin creates). Check email mailboxes or maildirs for an email that has the RAZOR2_CHECK. Use the following command to search a maildir:

     # find ~/.maildir -exec grep RAZOR2_CHECK {} \;
            *  1.0 RAZOR2_CHECK Listed in Razor2 ()
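
The same search can be done with a short script that counts how many messages mention the rule, which is easier to read than raw grep output on a large maildir. This is a sketch under assumptions: the `count_rule_hits` helper is hypothetical, it simply looks for the rule name anywhere in each file, and the demonstration uses a temporary directory with made-up messages.

```python
import os
import tempfile


def count_rule_hits(maildir: str, rule: str = "RAZOR2_CHECK") -> int:
    """Count the files under maildir that contain the given rule name."""
    hits = 0
    for folder, _subdirs, files in os.walk(maildir):
        for name in files:
            path = os.path.join(folder, name)
            with open(path, "r", errors="replace") as handle:
                if any(rule in line for line in handle):
                    hits += 1
    return hits


# Demonstration with two made-up messages
with tempfile.TemporaryDirectory() as maildir:
    os.mkdir(os.path.join(maildir, "new"))
    with open(os.path.join(maildir, "new", "msg1"), "w") as handle:
        handle.write("X-Spam-Report:\n * 1.0 RAZOR2_CHECK Listed in Razor2\n\nbody\n")
    with open(os.path.join(maildir, "new", "msg2"), "w") as handle:
        handle.write("Subject: hello\n\nbody\n")
    hits = count_rule_hits(maildir)

print(hits)  # 1
```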

If no email is found, check that Razor can communicate with the network. To do this, issue the following command for an email, ideally a spam email, when logged on as the system account that is used to run SpamAssassin:

    # razor-check -d < /path/to/file | grep "known spam"
    May 10 11:36:01.208055 check[29793]: [ 3] mail 1 is known spam.
    May 10 11:36:55.383525 check[29807]: [ 3] mail 1 is not known spam.
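
If many sample messages are being tested, the verdict lines can be picked out programmatically. The sketch below assumes the debug output contains lines of the form shown above; the `razor_verdicts` helper is hypothetical and relies only on the phrases "is known spam" and "is not known spam".

```python
import re

VERDICT = re.compile(r"mail (\d+) is (not )?known spam")


def razor_verdicts(debug_output: str) -> dict:
    """Map each mail number found in the debug output to True (spam) or False."""
    verdicts = {}
    for match in VERDICT.finditer(debug_output):
        verdicts[int(match.group(1))] = match.group(2) is None
    return verdicts


spam_line = "May 10 11:36:01.208055 check[29793]: [ 3] mail 1 is known spam.\n"
ham_line = "May 10 11:36:55.383525 check[29807]: [ 3] mail 1 is not known spam.\n"

print(razor_verdicts(spam_line))  # {1: True}
print(razor_verdicts(ham_line))   # {1: False}
```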

If there are no results, run the following command, which gives more output. The output may assist you in diagnosing an error:

    # razor-check -d < /path/to/file

When problems cannot be resolved by interpreting the debug output, there is a support forum on the Razor website at . This should be searched thoroughly before raising a new support request, to avoid wasting the time and experience that volunteers and enthusiasts provide freely to help others.

Razor scoring is altered in the same manner as for other SpamAssassin tests. The rule used to test Razor is called RAZOR2_CHECK, and it has a score associated with it in /usr/share/spamassassin/50_scores.cf, which can be overridden site-wide in /etc/mail/spamassassin/local.cf, or for a particular user in ~/.spamassassin/user_prefs.

Pyzor is written in Python, and so the Python language needs to be installed. This is included with most modern Linux distributions and is available for other operating systems including AIX, Solaris, and HP-UX. Pyzor source is packaged in a tar.bz2 file, using the bzip2 compression scheme. A bunzip2 program is required, and is installed on most Linux distributions. Binary bunzip2 utilities for other UNIX-like operating systems can be downloaded from the Internet.

Pyzor uses UDP port 24441 for communicating with a server, so any firewall must be configured to allow outgoing traffic on that port.

Pyzor is available in RPM format only for Mandrake Linux. The rpm -i command can be used to install the RPM once it is downloaded. Packages are also available for Gentoo Linux and Debian Linux. Use emerge pyzor or apt-get install pyzor respectively.

For all other distributions and operating systems, Pyzor should be installed from source. Pyzor can be downloaded from the Pyzor website at .

Change to a suitable directory for building the software, and unpack the tarball. Then, as root, change to the source directory and make the software. The INSTALL file lists the commands that need to be used.

For Pyzor version 0.4.0, the commands are listed below:

    $ cd /some/place
    $ bunzip2 -c /path/to/pyzor-x.y.z.tar.bz2 | tar xf -
    $ cd pyzor-x.y.z
    $ python setup.py build
    $ su - 
    # python setup.py install

This will install Pyzor. Administrative privileges are required for the python setup.py install command. By default, the client is installed as /usr/bin/pyzor:

    $ which pyzor
    /usr/bin/pyzor

Pyzor will work without any further configuration. Like Razor, it allows anonymous connections to check email. However, unlike Razor, it also allows anonymous connections to report email.

Pyzor does support accounts, but creating them is not a totally automatic process. For more details on creating accounts, refer to the Pyzor documentation included in the source tarball.

In a typical installation, Pyzor will be installed into the system path and SpamAssassin will detect Pyzor and automatically use it.

The following Pyzor configuration directives can be placed in ~/.spamassassin/user_prefs or /etc/mail/spamassassin/local.cf:

Directive	Purpose
use_pyzor	Turn Pyzor on (1) or off (0).
pyzor_timeout	Time in seconds to wait for a response from Pyzor. The default is 10. If Pyzor is unavailable, then all emails will be delayed for the specified number of seconds before Pyzor processing is abandoned and other SpamAssassin tests completed.
pyzor_max	Defines the threshold of how many times a message should be reported to Pyzor before SpamAssassin considers it spam. The default is 5. Altering this value to make it lower may cause non-spam emails to be identified as spam, and higher values may prevent spam from being identified.
pyzor_options	Allows extra parameters to be passed to Pyzor when invoked from SpamAssassin.


The following directives can only be set by an administrator and should be placed in /etc/mail/spamassassin/local.cf:

Directive	Purpose
pyzor_path	Location of the pyzor client executable. This setting is needed when the client is not in the path that SpamAssassin searches.

For the initial installation, no changes are required, but it is advisable to explicitly turn Pyzor on with use_pyzor.

To confirm that Pyzor is being used, enable the Pyzor email header, as described in the Pyzor Headers section, and restart spamd, if used. Send a test email or wait until emails have been received, then examine the headers of new emails for X-Spam-Pyzor:

    X-Spam-Pyzor: Reported 0 times.

If the header does not appear in new emails, then Pyzor is not being invoked. To confirm that Pyzor is installed, available, and configured, use the command-line. Use su to gain the privileges of the user account that is used to run SpamAssassin, and pipe the sample spam provided in the SpamAssassin distribution through the pyzor command:

    $ cd /path/to/spamassassin
    $ pyzor -d check < sample-spam.txt
    calculated digest: d152948f7f029b35691afa499c145797558b2fff
    sending: 'User: anonymous\nTime: 1090877637\nSig:   1b90084a35991758bfe310635cb0b548f7e5460a\n\nOp:     check\nOp-Digest: d152948f7f029b35691afa499c145797558b2fff\nThread: 3702\nPV: 2.0\n\n'
    received: 'Thread: 3702\nCount: 24\nWL-Count: 0\nCode: 200\nDiag: OK\nPV: 2.0\n\n'
    66.250.40.33:24441      (200, 'OK')     24      0
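
The received: line carries the report count that SpamAssassin compares against pyzor_max. As a rough illustration of how to read it, the sketch below splits the response shown above into fields. The `parse_pyzor_response` helper is hypothetical, and treating "at least pyzor_max reports" as the cut-off is an assumption about the comparison.

```python
def parse_pyzor_response(payload: str) -> dict:
    """Split a 'Key: value' response, one field per line, into a dictionary."""
    fields = {}
    for line in payload.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


received = "Thread: 3702\nCount: 24\nWL-Count: 0\nCode: 200\nDiag: OK\nPV: 2.0\n\n"
fields = parse_pyzor_response(received)

report_count = int(fields["Count"])
pyzor_max = 5                          # the default quoted in the directive table
looks_bulk = report_count >= pyzor_max

print(report_count, looks_bulk)  # 24 True
```

In the example response the sample spam has been reported 24 times, well above the default threshold of 5.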

If the results show a sending: and a received: line, then Pyzor is operating correctly from the command line, so the problem must be in the integration with SpamAssassin. As this is a SpamAssassin issue, help should be sought from the SpamAssassin website or mailing lists. It is always best to search all archives before posting to discussion groups or mailing lists.

If Pyzor fails to work, the problem is a Pyzor configuration issue. The Pyzor website has links to the archive of the pyzor-users mailing list, which is a searchable archive.

Pyzor scoring is altered in the same manner as for other SpamAssassin tests. The rule used to test Pyzor is called PYZOR_CHECK and it has a score associated with it in /usr/share/spamassassin/50_scores.cf, which can be overridden site-wide in /etc/mail/spamassassin/local.cf, or for a particular user in ~/.spamassassin/user_prefs.

To disable Pyzor, set use_pyzor to 0 in ~/.spamassassin/user_prefs or /etc/mail/spamassassin/local.cf:

    use_pyzor    0

Pyzor Headers

Versions of SpamAssassin before 3.0 used the pyzor_add_header configuration directive to add a header to emails. This has been deprecated, and will be removed in future versions of SpamAssassin. The current method is to add an email header with Pyzor information. This is achieved by adding the following to /etc/mail/spamassassin/local.cf or ~/.spamassassin/user_prefs:

    add_header all Pyzor _PYZOR_

Although the correct term is 'The Distributed Checksum Clearinghouse', it is referred to as DCC here to enhance readability. DCC is the most effective network service, but also the most complex.

DCC is written in C. To build from source (binary packages are rare) a C compiler is required. DCC uses UDP port 6277 to communicate with servers, so this should be enabled through any firewall in use.

DCC is available in RPM format for Mandrake, but for no other RPM-based distribution. Use the rpm -i command to install it. DCC is available in Gentoo Linux and Debian Linux; use emerge net-mail/dcc under Gentoo, and apt-get install dcc-client in Debian. For other distributions and versions of UNIX, DCC should be installed from source.

The source for DCC can be downloaded from .

The source is packaged as a tar file. Unpack this and then run the configure script. This script will automatically detect any required software libraries or inform if they are missing. Then, use make to build and install the software. The make install command should be run as root.

    $ cd /some/place
    $ uncompress -c /path/to/dcc-dccd-a.b.c.tar.Z | tar xf -
    $ cd dcc-dccd-a.b.c
    $ ./configure
    $ make
    $ su
    # make install

If there are problems when building DCC, it is best to turn to the DCC website. There is an FAQ and a mailing list archive on the site. Always read the FAQ and search the mailing list archive before posting for help.

By default, DCC will create a usable configuration. DCC uses user identities in a similar way to Razor, and DCC also has a default user who can be used to query spam.

The DCC network is large, and only some of the machines are available for public use. Others are provided and used exclusively by organizations such as ISPs and spam-filtering providers.

To use DCC with SpamAssassin, add the following line to ~/.spamassassin/user_prefs:

    use_dcc 1

The following configuration directives can be used in /etc/mail/spamassassin/local.cf or ~/.spamassassin/user_prefs:

Directive	Purpose
use_dcc	Turn DCC on (1) or off (0).
dcc_timeout	Time in seconds to wait for a response from DCC. The default is 10. If DCC is unavailable, all emails will be delayed for the specified number of seconds before DCC processing is abandoned and other SpamAssassin tests completed.
dcc_body_max dcc_fuz1_max dcc_fuz2_max	Thresholds of the number of times the email has been reported to DCC. Fuz1 and Fuz2 are different methods that DCC uses to mark the message. For more details, consult the DCC documentation.
dcc_dccifd_path	Location of socket to communicate with the DCC daemon dccifd.


For the initial installation, just turn DCC on with the use_dcc directive.

DCC is a complex system of programs and networks, and rewards further investigation. The following configuration directives can only be specified by an administrator, and should be placed in /etc/mail/spamassassin/local.cf:

Directive	Purpose
dcc_home	Location of the home directory for DCC.
dcc_path	Location of the dccproc client. Specifying this may result in slightly better performance while running dcc. This must be specified if dccproc is not in the system path.
dcc_options	Options to be passed to the DCC client, dccproc.

To confirm that DCC is being used, enable the DCC email header as described in the DCC Headers section and restart spamd, if used. Send a test email or wait until emails have been received, and then examine the headers of new emails for X-Spam-DCC. Note that the header may be split across two lines.

An example header is shown below:

    X-Spam-DCC: EATSERVER: host.domain.com 1166; IP=ok Body=1 Fuz1=1200 Fuz2=many

If the header does not appear, then DCC is not being invoked. To confirm that DCC is available from the command line, use the system account that is used to run SpamAssassin, and create a file mailmessage that contains a complete email message, including headers. Issue the following command and examine the results:

    $ dccproc < mailmessage | grep DCC
    X-DCC-EATSERVER-Metrics: host.domain.com 1166; Body=2 Fuz1=2 Fuz2=2
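
The Body, Fuz1, and Fuz2 values in this line are the counts that the dcc_body_max, dcc_fuz1_max, and dcc_fuz2_max thresholds are compared with. The sketch below extracts them; the `parse_dcc_counts` and `exceeds` helpers are hypothetical, the threshold values are made up for the example, and treating the word "many" as an unbounded count is an assumption.

```python
import re


def parse_dcc_counts(header_value: str) -> dict:
    """Extract the Body, Fuz1, and Fuz2 counts from a DCC header line."""
    counts = {}
    for name, value in re.findall(r"(Body|Fuz1|Fuz2)\s*=\s*(\w+)", header_value):
        # 'many' is treated here as larger than any numeric threshold.
        counts[name] = float("inf") if value == "many" else int(value)
    return counts


def exceeds(counts: dict, thresholds: dict) -> bool:
    """True if any count reaches its (illustrative) threshold."""
    return any(counts.get(name, 0) >= limit for name, limit in thresholds.items())


quiet = "X-DCC-EATSERVER-Metrics: host.domain.com 1166; Body=2 Fuz1=2 Fuz2=2"
bulk = "X-Spam-DCC: EATSERVER: host.domain.com 1166; IP=ok Body=1 Fuz1=1200 Fuz2=many"
thresholds = {"Body": 1000, "Fuz1": 1000, "Fuz2": 1000}  # made-up values

print(parse_dcc_counts(quiet))                     # {'Body': 2, 'Fuz1': 2, 'Fuz2': 2}
print(exceeds(parse_dcc_counts(quiet), thresholds))  # False
print(exceeds(parse_dcc_counts(bulk), thresholds))   # True
```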

If the results show the X-DCC-EATSERVER line, then DCC is operating correctly from the command line, and any problem must lie in the integration with SpamAssassin. As this is a SpamAssassin issue, help should be sought from the SpamAssassin website and mailing lists. It is always best to search any archives before posting to discussion groups or mailing lists.

If DCC fails to work even from the command line, then the problem lies in DCC configuration. The DCC website has links to an FAQ, and a mailing list archive, which should be read and searched before posting on the mailing list for help.
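
The command-line check described above can be wrapped in a small shell function, sketched here under the assumption that dccproc behaves as in the example (it writes the message to standard output with an X-DCC header added). The function name check_dcc and its messages are hypothetical.

```shell
# check_dcc FILE: report whether dccproc adds an X-DCC header to FILE.
# Returns 0 if a header is found, 1 if not, 2 if dccproc is unavailable.
check_dcc() {
  msg=$1
  if ! command -v dccproc >/dev/null 2>&1; then
    echo "dccproc not found in PATH"
    return 2
  fi
  # The server name in the header varies, so match only the fixed parts.
  if dccproc < "$msg" | grep -q '^X-DCC-.*-Metrics:'; then
    echo "DCC works from the command line"
  else
    echo "dccproc ran but produced no X-DCC header"
    return 1
  fi
}
```

Run it as the account that runs SpamAssassin, for example `check_dcc mailmessage`. A result of 0 points to the SpamAssassin integration, and 1 or 2 points to the DCC installation itself.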

DCC scoring is altered in the same manner as for other SpamAssassin tests. The rule used to test DCC is called DCC_CHECK and it has a score associated with it in /etc/mail/spamassassin/local.cf, which can be overridden for a particular user in ~/.spamassassin/user_prefs.
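
As an illustration, the sketch below writes a per-user score for DCC_CHECK to a scratch file and reads it back. The value 3.0 and the file name user_prefs.example are hypothetical; a real override would go in ~/.spamassassin/user_prefs.

```shell
# Scratch copy for illustration; the real file is ~/.spamassassin/user_prefs.
prefs=./user_prefs.example

cat > "$prefs" <<'EOF'
# Hypothetical per-user score for the DCC rule.
score DCC_CHECK 3.0
EOF

# Print the score set for DCC_CHECK in the scratch file.
awk '$1 == "score" && $2 == "DCC_CHECK" { print $3 }' "$prefs"
```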

To disable DCC tests, set use_dcc to 0 in /etc/mail/spamassassin/local.cf or ~/.spamassassin/user_prefs.

    use_dcc 0

Versions of SpamAssassin before 3.0 used the dcc_add_header configuration directive to add a header to emails. This has been deprecated, and will be removed in a future version of SpamAssassin. The current method is to use the add_header directive to add an email header containing the DCC information. To do this, add the following to /etc/mail/spamassassin/local.cf or ~/.spamassassin/user_prefs:

    add_header all DCC _DCCB_: _DCCR_

A spamtrap is an email address that has never been associated with a real person or role in a company. The spamtrap is placed on web pages in such a way that it can only be picked up by spammer web spiders. When email is received at the spamtrap address, it can only be spam, and so the email can be sent to the Razor network as definite spam.

Normally a spamtrap is hidden from view by using a tiny font, by hiding the email address behind another element of the page, by using the same color for the text and the background, or by another technique. The spammer's web spider will nevertheless detect the email address and add it to its database of valid email addresses.

Spamtraps can also be added to postings on Usenet, as long as it is made clear that the email address should not be used for real replies.

A spamtrap address should be made of completely random characters. Using an address such as info@domain.com, contact@domain.com, or other popular generic addresses is dangerous, even if they have never been advertised or used.

Similarly, email addresses that look like real email addresses, such as those of the format firstname.lastname@domain.com should not be used. It's not uncommon for people to make mistakes when typing email addresses. There could be a legitimate user with a domain name similar to the one used in the spamtrap, and one day, an organization may hire someone with the same name as the spamtrap address.

Addresses such as DFERQFER@domain.com and QT56.HYR5@domain.com are ideal. The rules for a valid email address are that the name should start with an alphabetic character, and continue with alphanumeric characters, or the characters '.' and '-'. Some users prefer to use spamtrap email addresses that indicate to a human they are spamtraps, for example spamtrap@domain.com. It is difficult to know if this will affect the email addresses used by spammers.
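
A random address of this kind can be generated rather than invented by hand. The following is a minimal sketch, assuming /dev/urandom and a head that supports -c are available. The function name make_spamtrap_address and the length of 12 characters are arbitrary choices for illustration.

```shell
# make_spamtrap_address DOMAIN: print a random address at DOMAIN whose
# local part starts with a letter and continues with letters or digits.
make_spamtrap_address() {
  domain=$1
  # First character: one uppercase letter.
  first=$(LC_ALL=C tr -dc 'A-Z' < /dev/urandom | head -c 1)
  # Remaining 11 characters: uppercase letters or digits.
  rest=$(LC_ALL=C tr -dc 'A-Z0-9' < /dev/urandom | head -c 11)
  printf '%s%s@%s\n' "$first" "$rest" "$domain"
}

make_spamtrap_address domain.com
```

Each call prints a different address, so the result should be recorded once and reused for both the web page and the mail account.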

The next step is to add the spamtrap address to a web page, so that spammers' web spiders can find it. As mentioned above, there are many ways to do this, and only one method is shown here.

To complete this task, you should be familiar with editing HTML and publishing it to a website.

Take the existing HTML page and edit it with a text editor such as Notepad, Vi, or Emacs. An HTML editor such as Microsoft FrontPage or Macromedia Dreamweaver can also be used.

Add the following style definition to the top of the HTML document, just after the <head> element:

This defines a style called hidden. When this style is used, anything using the style will be hidden. Next, add the email address just after the <body> tag:

    
    
    Please do not use this address

This defines a block that uses the hidden style. Within the block is a normal mailto: link, containing the spamtrap email address. As screen readers used by blind or partially sighted people may display or read the hidden text, the warning is added to the link. Anyone using a very old web browser may also see the text.

The web page should be saved and published on a live site. The presence of the spamtrap can be confirmed by visiting the page with a web browser and viewing the page source.
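
The markup described in this section can be sketched as a small test page. Several details here are assumptions: the hidden style is implemented with a display: none rule, DFERQFER@domain.com stands in for the real spamtrap address, and spamtrap-test.html is a hypothetical file name. The final command performs the same check as viewing the page source.

```shell
# Write a minimal test page containing a hidden spamtrap link.
page=./spamtrap-test.html

cat > "$page" <<'EOF'
<html>
<head>
<style type="text/css">
.hidden { display: none; }
</style>
</head>
<body>
<div class="hidden">
<a href="mailto:DFERQFER@domain.com">Please do not use this address</a>
</div>
</body>
</html>
EOF

# Count the lines of the page source that contain the spamtrap link.
grep -c 'mailto:DFERQFER@domain.com' "$page"
```

For a page that is already published, the same grep can be run against a copy of the page fetched with a tool such as curl.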

The spamtrap email account should be a regular system account. The processing of email, however, is not like that for other users. Instead of filtering and processing email, the user account should be configured to use the report feature of SpamAssassin to automatically report all incoming emails as spam.

The .procmailrc file should look like this:

    :0
    | /usr/bin/spamassassin -r

This procmail recipe uses the -r flag of the spamassassin client. This flag is only available in spamassassin, and not in the spamc client, so even if spamd is running, spamassassin should be used. The -r flag instructs SpamAssassin to report the email to the DCC, Razor, and Pyzor, if they are enabled.

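Because procmail treats a message piped to a program as delivered, this recipe on its own does not also store the spam in the spamtrap mailbox. To keep the emails as well, add the c (carbon copy) flag to the recipe (:0 c), so that default delivery still takes place. The stored emails can then be used as a corpus for training the Bayesian filter or recalculating rule scores.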

The user_prefs file for the spamtrap user should enable DCC, Pyzor, and Razor:

    use_pyzor 1
    pyzor_path /usr/bin/pyzor
    use_dcc 1
    use_razor2 1
    razor_config /home/spamtrap/.razor/razor-agent.conf

Razor, Pyzor, and DCC should be configured correctly to send spam reports to the various services as described in this chapter.

Network tests allow a site to benefit from other sites reporting email relays and spam-advertised websites. SpamAssassin includes support for RBLs and SURBLs. The latter provide a promising new technique against spam, which works by detecting the URIs that are advertised in spam emails. Razor, Pyzor, and DCC are email comparison systems. DCC is considered the most effective. These tests can be used together, and most settings are configurable on a site-wide or per-user basis.
