Problem
What is the issue and what problems can it cause? Time creep can occur on a server or workstation that sustains a high throughput of activity for an extended period. It causes certain key time values in a Notes database to be set in the future, which can lead to documents not appearing in views and folders and can prevent databases from replicating. The issue is more prevalent on, but not limited to, multiprocessor machines with faster processors. Because of how servers and workstations are typically used, it occurs mostly on servers and rarely on workstations.

Solution
The majority of conditions that would cause time creep have been addressed in Domino/Notes Release 5.0.2, with additional fixes in 5.0.11 and the upcoming 5.0.12 scheduled for release in Q1-2003.
Why does this occur? The product needs to generate unique time values for things such as Replica IDs and document UNIDs. To implement this unique time/date, a function was created that increments a time value if the current time returned is not unique. The granularity of time in the Notes products is 1/100th of a second, so the unique time value is increased by 1 tick when it is not unique (1 tick = 1/100th of a second = 10 milliseconds). As the product progressed, many different areas of the product were written to call the same code for their own unique calculations. We also dramatically improved the throughput of many operations throughout the product, most notably view rebuilds/updates and replication. This, coupled with advances in processor speed at a rate much higher than anyone predicted, has exposed this design as a major limitation.
The main issue is that if the internal Notes function OSCurrentTIMEDATEUnique is called more than once per tick for a sustained period of time, the PrevUniqueTIMEDATE value moves into the future. As a result, anyone calling this function is returned a future time, which causes problems. One visible symptom is that the Created time on documents created during this period is in the future, while the modified time, operating system time, and server time are all current and correct. During slow or idle periods on the Notes server, real time eventually catches up so that the internal unique time value is current again. How long this takes depends on how long the server remains idle, but it will take at least as long as the value is ahead. For example, if the time is one hour into the future, it will take at least one hour to catch up. When this problem occurs, the time it takes to push the server into the future is typically much shorter than the time it takes to catch up.
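The mechanism described above can be illustrated with a small simulation. This is a toy model only, not the actual Notes code: the names `UniqueClock` and `creep_after` are hypothetical, and the sketch assumes evenly spaced calls against a simulated clock rather than real server activity.

```python
TICK_MS = 10  # Notes time granularity: 1 tick = 1/100th of a second


class UniqueClock:
    """Toy model of a unique-timestamp generator (hypothetical, not Notes code)."""

    def __init__(self):
        self.prev_unique = -1  # last value handed out, in ticks

    def next_unique(self, now_ms):
        """Return a tick value that is strictly greater than any returned before."""
        now_ticks = now_ms // TICK_MS
        if now_ticks > self.prev_unique:
            self.prev_unique = now_ticks  # real time is ahead: use it as-is
        else:
            self.prev_unique += 1  # current time is not unique: bump by one tick
        return self.prev_unique


def creep_after(calls_per_second, seconds):
    """Return how far (in ms) the unique time leads real time after a burst."""
    clock = UniqueClock()
    now_ms = 0
    for i in range(calls_per_second * seconds):
        now_ms = i * 1000 // calls_per_second  # simulated wall-clock time
        clock.next_unique(now_ms)
    return clock.prev_unique * TICK_MS - now_ms


if __name__ == "__main__":
    for rate in (50, 100, 1000):
        print(rate, "calls/s for 10 s ->", creep_after(rate, 10), "ms ahead")
```

In this model, rates of up to 100 calls per second produce no creep, while 1,000 calls per second for 10 seconds leaves the unique time roughly 90 seconds ahead of real time. Once the burst stops, the model returns real time again only after real time has passed the stored value, which is why catching up takes at least as long as the value is ahead.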
Prior to 5.0.2, time creep problems were always experienced server-wide. In Domino 5.0.2, changes were implemented to restrict time creep problems to individual databases by tracking unique time values on a per-database basis. There are still some areas after 5.0.2 where OSCurrentTIMEDATEUnique is used server-wide, but the majority of customers will not encounter them.
How can I determine whether I am experiencing this problem? To check for database-specific time creep:
The most basic method is to examine the document properties and see whether the Created time is ahead of the Modified time. This is a good test only if the document was created on the same server. For example, it is not a good test for a document that was mailed, because the created time could have been derived on a different server whose clock may be ahead of the recipient server's.
To check for server-wide time creep:
Add a column to the Miscellaneous Events view of the Log that will compare modified to created. The formula to use is:
(@Created - StartTime)/60
This will show you, in minutes, how far the created time is ahead of the modified time. Reviewing which server activities took place around the time the creep occurred can also help pinpoint the activities that contributed most to it. If the log file needs to be sent to your support provider for analysis, it is important to make an operating system copy of LOG.NSF so that the created and modified times remain the same. Copying and pasting documents, or pulling a new replica, will taint the data.
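As a worked example of the arithmetic behind the column formula above, the following Python sketch performs the same calculation outside of Notes. The function name `creep_minutes` and the sample timestamps are hypothetical and are used only for illustration.

```python
from datetime import datetime


def creep_minutes(created, start_time):
    """Minutes by which the created time leads the logged start time.

    Mirrors (@Created - StartTime)/60: the difference in seconds divided by 60.
    A positive result means the created time is in the future.
    """
    return (created - start_time).total_seconds() / 60


if __name__ == "__main__":
    # Hypothetical values: a log document stamped one hour ahead.
    created = datetime(2003, 1, 15, 11, 0, 0)
    start_time = datetime(2003, 1, 15, 10, 0, 0)
    print(creep_minutes(created, start_time))
```

With these sample values the result is 60.0, meaning the created time is one hour into the future.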
What is Lotus doing about this? We put a fix in 5.0.2 to address the bulk of time creep issues. This was a major low-level architectural change that we were unable to port back to 4.6.x. However, customers who ran into time creep and upgraded to at least 5.0.2 have found this fix successful. Additional fixes are in 5.0.11 and are scheduled for the upcoming 5.0.12.
We have already implemented a series of incremental fixes in maintenance releases of 4.5.x and 4.6.x that help minimize the occurrence of this issue. A Compact-related fix (SPR #SVRO3Z8QTH) was included in Notes/Domino QMR Releases 4.5.7 and 4.6.3. A series of fixes in 4.6.4a and 4.6.5 is covered under SPR #SVRO465TH6. This latter issue was resolved only in the 4.6.x line and still exists in 4.5.7x. So, to get the most current fixes under 4.x, upgrade to at least 4.6.5. To truly address the issue, upgrade to at least 5.0.2, and preferably to the latest 5.x or 6.x maintenance release currently available.
The code changes in 5.0.2 localize most time creep to the database level, and prevent creep from affecting processes such as replication and view indexing. What this means is that rather than having creep in multiple active databases affecting all databases in an additive manner at the server level, each database is affected only by its own activity, potentially resulting in much smaller creep, localized to the individual database. Additionally, the creep in the individual database should not affect replication or indexing of this database.
One area that may still be a problem is programmatic activity that relies on the created time of a document. However, several programmatic methods to circumvent this are available, such as using @Now in a computed-when-composed field on the document.
What can I do to help minimize the impact of this problem?
- Upgrade to the latest release of Notes/Domino that includes the latest fixes surrounding this issue. 5.0.2 is the release that has the major fix with 5.0.11 and 5.0.12 providing several more. 4.6.5 is the latest release of 4.x that has all the latest fixes possible in that code stream.
- Try to identify which tasks or operations are causing the creep. Determine whether it is caused by one or more agents or API programs processing many documents, or whether it is related to certain simultaneous tasks. Add the view column to the Log as described above in the "How can I determine..." section.
- Once the tasks are identified, try to slow them down or schedule them so that they do not run at the same time as other intensive operations. For example, if concurrent agents are causing the problem, run fewer executive processes by modifying the server document. If a single agent causes the problem, add pause code to slow the agent down, particularly in operations that create or modify documents. Refer to the document titled "How to Have a Script in Notes Pause for Some Time and Then Continue" (#1088590) for a LotusScript pause example. The same slowdown method can be applied to API programs.
- Bring down the server and rename LOG.NSF and MAIL.BOX. We have seen that when time creep happens on a server, leaving the existing MAIL.BOX and LOG.NSF in place compounds the problem. Both files are recreated automatically when the server comes back up.
- Lower purge interval to carry fewer deletion stubs in a database. There were a series of fixes surrounding deletion stubs and time creep put into 4.6.5, but there are still areas we were unable to resolve. By default, the purge interval is 90 days. If your replication cycle is shorter than this, you can lower the purge interval to carry fewer deletions in a database. Refer to the document titled "Q&As About Replication Purge Intervals and Cutoff Dates" (#1110117) for more details on how to do this.
- We have had success migrating a server to a slower, less powerful machine to minimize time creep, for example a dual Pentium 200 MHz instead of a quad Pentium 400 MHz.
- Add debug parameters to the NOTES.INI to generate more log output, which slows the server down. The more logging a server does, the slower it runs. We have had some success with the parameter Debug_Nif_All=1. It generates a large amount of output, so check the size of the Log often and verify that disk space is plentiful. The amount of data generated with this parameter is directly related to the number of view updates performed on the server.
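The pacing suggested in the list above (slowing an agent or API program so that it creates or modifies documents less rapidly) can be sketched as follows. This is an illustrative Python sketch, not the LotusScript example from the referenced document: the generator name `paced`, its parameters, and the default of 50 operations per second are assumptions chosen to stay below the 100 ticks per second that Notes can represent.

```python
import time


def paced(items, max_per_second=50, clock=time.monotonic, sleep=time.sleep):
    """Yield items no faster than max_per_second (hypothetical helper).

    The clock and sleep functions are injectable so the pacing can be
    tested without actually waiting.
    """
    min_interval = 1.0 / max_per_second
    next_allowed = clock()
    for item in items:
        delay = next_allowed - clock()
        if delay > 0:
            sleep(delay)  # pause until the next slot opens
        next_allowed = max(next_allowed, clock()) + min_interval
        yield item


if __name__ == "__main__":
    start = time.monotonic()
    for doc_number in paced(range(10), max_per_second=50):
        pass  # placeholder for the work that creates or modifies a document
    print("elapsed seconds:", round(time.monotonic() - start, 2))
```

The loop body stands in for whatever per-document work the agent performs. Pausing between documents, rather than once per batch, keeps the call rate even instead of allowing short bursts.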
This issue occurs on all platforms.
If there are any reproducible scenarios not already covered in responses to SPR #SVRO465TH6, please obtain the required steps and file an additional response to the SPR detailing exact steps to reproduce.
Supporting Information:
Areas Affected by Future Time:
- Created Time of new documents
- Modified in this File time of documents
- Created Time of new databases
- Modified Time of databases
Areas Not Affected by Future Time:
- Operating System Time
- Domino Server Console Time (sh server at console)
- Modified Time of Documents
- Added to File Time of Documents
Post-5.0.2 time creep SPRs:
SPR | Fixed | Area | Technote
CBOE57YQU7 | 5.0.12 | DBOpens |
TDEN5AUJRU | 5.0.12 | NifOpenCollection |
CBOE57YQKA | 5.0.11 | Deletions |
SEGN56VHAE | open | multiple replicators |
TDEN5C5KVR | open | stats |
MCOT579RZ7 | 6.0/5.x open | SCOS |