2009-04-11 07:43:48
A while ago I posted a function that could be used to highlight certain matches within a document. It uses a regular expression to replace the innerHTML property of the specified container. Since then, various things I’ve read have made me realize that it’s just not a solid solution and doesn’t cut it for realistically complicated websites.
The only viable solution is to progressively walk the DOM tree, stopping only at text nodes (nodeType === 3), and then apply the conventional ‘replace’ to each of those nodes.
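To make the walk a little more concrete, here is a minimal sketch that only gathers the text nodes and leaves the replacing for later. The name collectTextNodes and its skip argument are made up for this illustration. It reads nothing but childNodes, nodeType and nodeName, so it can be tried on plain objects as well as on real DOM nodes.

```javascript
// Collect every text node (nodeType 3) beneath `root`, in document order.
// `skip` is an optional map of lowercase tag names that should not be entered
// (a hypothetical parameter, used only in this sketch).
function collectTextNodes(root, skip) {
  var found = [];
  var children = root.childNodes || [];
  for (var i = 0; i < children.length; i++) {
    var node = children[i];
    if (node.nodeType === 3) {
      // Text node: keep it.
      found.push(node);
    } else if (node.nodeType === 1 &&
               !(skip && skip[node.nodeName.toLowerCase()])) {
      // Element node: descend into it unless its tag is on the skip list.
      found = found.concat(collectTextNodes(node, skip));
    }
    // Comments and other node types are ignored.
  }
  return found;
}
```

In a browser this could be called as collectTextNodes(document.body, { script: true, style: true }).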
The whole process is carried out by the following function:
function findAndReplace(searchText, replacement, searchNode) {
    if (!searchText || typeof replacement === 'undefined') {
        // Throw error here if you want...
        return;
    }
    var regex = typeof searchText === 'string' ?
            new RegExp(searchText, 'g') : searchText,
        childNodes = (searchNode || document.body).childNodes,
        cnLength = childNodes.length,
        excludes = 'html,head,style,title,link,meta,script,object,iframe';
    // Walk the children backwards so that replacing a node does not
    // shift the indexes that are still to be visited.
    while (cnLength--) {
        var currentNode = childNodes[cnLength];
        // Element node: recurse unless its tag is excluded. Names are matched
        // between commas so that 'a' does not match the end of 'meta'.
        if (currentNode.nodeType === 1 &&
            (',' + excludes + ',').indexOf(',' + currentNode.nodeName.toLowerCase() + ',') === -1) {
            findAndReplace(searchText, replacement, currentNode);
        }
        // Only text nodes that contain a match are processed further.
        if (currentNode.nodeType !== 3 || !regex.test(currentNode.data)) {
            continue;
        }
        // Parse the replaced text as HTML and move the result into a fragment.
        var parent = currentNode.parentNode,
            frag = (function(){
                var html = currentNode.data.replace(regex, replacement),
                    wrap = document.createElement('div'),
                    frag = document.createDocumentFragment();
                wrap.innerHTML = html;
                while (wrap.firstChild) {
                    frag.appendChild(wrap.firstChild);
                }
                return frag;
            })();
        // Swap the original text node for the new fragment.
        parent.insertBefore(frag, currentNode);
        parent.removeChild(currentNode);
    }
}
No library or framework is required to use this function; it’s entirely stand-alone. The function requires two parameters, and the third one is optional:
searchText - This can be either a string or a regular expression. Either way, it will eventually become a RegExp object. So, if you wanted to search for the word “and”, passing that alone would not be appropriate: every word that contains “and” would be matched. To test for word boundaries, use either the string '\\band\\b' or the regular expression /\band\b/g (remember the global flag).
replacement - This parameter is passed directly to the String.replace function, so you can use either a string replacement (with $1, $2, $3 etc. for backreferences) or a function.
searchNode - This parameter is mainly for internal usage but you can, if you so desire, specify the node under which the search will take place. By default it’s set to document.body.
A typical example would be highlighting search keywords. Here’s how that would work:
// Just an example:
var searchMatch = document.referrer.match(/[?&]q=([^&]+)/),
    searchTerm = searchMatch && searchMatch[1];
if (searchTerm) {
    findAndReplace('\\b' + searchTerm + '\\b', function(term){
        return '' + term + '';
    });
}
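One thing to watch when building a pattern from a referrer: the raw query value is still URL-encoded and may contain characters that mean something special in a regular expression. The sketch below shows one way to clean it up first. The helper names (escapeRegExp, toSearchPattern, highlight) and the "highlight" class are my own assumptions for illustration, not part of findAndReplace.

```javascript
// Escape characters that have a special meaning inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Turn a raw query-string value into a pattern string that could be passed
// as searchText. Note: decodeURIComponent throws a URIError on malformed
// input, so real code may want a try/catch around this call.
function toSearchPattern(rawQueryValue) {
  var decoded = decodeURIComponent(rawQueryValue.replace(/\+/g, ' '));
  return '\\b' + escapeRegExp(decoded) + '\\b';
}

// Hypothetical replacement callback: wraps each match in a span.
// The element and class name are assumptions, pick whatever suits your CSS.
function highlight(term) {
  return '<span class="highlight">' + term + '</span>';
}
```

With these in place, the call would look something like findAndReplace(toSearchPattern(searchTerm), highlight). Bear in mind that \b only helps when the term starts and ends with a word character.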
As I said, a string can be passed as the second parameter and you can use ‘$1, $2 etc.’ for backreferences:
findAndReplace('(microsoft|apple|sony)', '$1');
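Since the replacement goes straight to String.replace, its behaviour can be tried out on a plain string first. Here is a small sketch of both forms. The strong wrapper and the case-insensitive flag are assumptions added for the example. A string searchText is compiled with the 'g' flag only, so a case-insensitive search means passing a regular expression instead.

```javascript
var text = 'We compared Apple and Sony laptops.';
var brands = /(microsoft|apple|sony)/gi;

// String replacement: $1 stands for the first captured group.
var viaString = text.replace(brands, '<strong>$1</strong>');
// viaString is now:
// 'We compared <strong>Apple</strong> and <strong>Sony</strong> laptops.'

// Function replacement: called with the whole match, then each captured group.
var viaFunction = text.replace(brands, function (match, brand) {
  return '<strong>' + brand.toUpperCase() + '</strong>';
});
// viaFunction is now:
// 'We compared <strong>APPLE</strong> and <strong>SONY</strong> laptops.'
```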
You’ll notice that within the function there’s an ‘excludes’ string that contains a comma-separated list of node names to exclude from all searches. You can add to and remove from this list as needed.
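If you do edit the list, keep in mind that matching names against a comma-separated string needs some care: a plain substring test for 'a,' would find the tail of 'meta,' and so treat links as excluded. One possible alternative, sketched here with made-up helper names (makeExcludeTest, isExcluded), is to turn the list into a lookup table so that every check is a whole-name match.

```javascript
var excludes = 'html,head,style,title,link,meta,script,object,iframe';

// Build a lookup table once; the returned function then does an exact,
// case-insensitive match against the names in the list.
function makeExcludeTest(list) {
  var lookup = {};
  var names = list.split(',');
  for (var i = 0; i < names.length; i++) {
    lookup[names[i].toLowerCase()] = true;
  }
  return function (nodeName) {
    return Object.prototype.hasOwnProperty.call(lookup, nodeName.toLowerCase());
  };
}

var isExcluded = makeExcludeTest(excludes);
// isExcluded('SCRIPT') is true, isExcluded('A') is false.
```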
Porting over to MooTools or jQuery is quite pointless because neither library offers anything in the way of text node traversal, but feel free to wrap it all up in the respective namespace.
One notable limitation is that the function cannot find text that is split across separate nodes. For example, searching for “pineapple” in the following HTML would not work:
We ate mango, pine<strong>apple</strong> and passion fruit!!
I’ve tried to find ways around this but it seems a lost cause.
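To see why, it helps to look at what the function actually gets to test. The sketch below does no replacing at all. It only reports whether a pattern matches the joined text but none of the individual text nodes, which is exactly the “pineapple” situation. The function name and the plain-object nodes are assumptions made for the illustration.

```javascript
// Report whether `pattern` (a string) matches only once the text nodes are
// joined together. `textNodes` is an array of objects with a `data` string,
// listed in document order.
function matchesOnlyAcrossNodes(textNodes, pattern) {
  var joined = '';
  for (var i = 0; i < textNodes.length; i++) {
    if (new RegExp(pattern).test(textNodes[i].data)) {
      return false; // an ordinary single-node match exists
    }
    joined += textNodes[i].data;
  }
  return new RegExp(pattern).test(joined);
}

// The three text nodes from the HTML above.
var nodes = [
  { data: 'We ate mango, pine' },
  { data: 'apple' },
  { data: ' and passion fruit!!' }
];
var pineappleIsSplit = matchesOnlyAcrossNodes(nodes, 'pineapple'); // true
var mangoIsSplit = matchesOnlyAcrossNodes(nodes, 'mango');         // false
```

Detecting the case is the easy part. Replacing such a match would mean rewriting several nodes at once without breaking the elements in between, which is where it gets hard.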