DOMParser & SymbolTable [Archive]

nul

23-05-2006, 16:50

I've recently been working a bit with XML parsing in Java.
I'm using the DOMParser for it:
http://xerces.apache.org/xerces2-j/javadocs/xerces2/org/apache/xerces/parsers/DOMParser.html#DOMParser(org.apache.xerces.util.SymbolTable)
Now my question: what is the SymbolTable good for?
I searched a bit but found nothing.

Does anyone have an idea?

bischi

23-05-2006, 17:08

public class SymbolTable
extends java.lang.Object

This class is a symbol table implementation that guarantees that strings used as identifiers are unique references. Multiple calls to addSymbol will always return the same string reference.

The symbol table performs the same task as String.intern() with the following differences:

* A new string object does not need to be created in order to retrieve a unique reference. Symbols can be added by using a series of characters in a character array.
* Users of the symbol table can provide their own symbol hashing implementation. For example, a simple string hashing algorithm may fail to produce a balanced set of hashcodes for symbols that are mostly unique. Strings with similar leading characters are especially prone to this poor hashing behavior.
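
To make the first difference concrete, here is a minimal sketch of a table that hands out unique references directly from a character array. MiniSymbolTable and its methods are hypothetical names invented for this illustration. It is a simplified stand-in and not the Xerces implementation: it has a fixed number of buckets and never rehashes.

```java
/** Hypothetical, simplified stand-in for a symbol table; NOT the Xerces class. */
final class MiniSymbolTable {

    /** One entry in a bucket's chain. */
    private static final class Entry {
        final String symbol;
        final Entry next;

        Entry(String symbol, Entry next) {
            this.symbol = symbol;
            this.next = next;
        }
    }

    private final Entry[] buckets;

    MiniSymbolTable(int capacity) {
        buckets = new Entry[capacity];
    }

    /** Returns the canonical String for the given character range. */
    String addSymbol(char[] buffer, int offset, int length) {
        int index = hash(buffer, offset, length) % buckets.length;
        // Search the chain first: no String is allocated for a known symbol.
        for (Entry e = buckets[index]; e != null; e = e.next) {
            if (matches(e.symbol, buffer, offset, length)) {
                return e.symbol;
            }
        }
        // Unknown symbol: create the String once and remember it.
        String symbol = new String(buffer, offset, length);
        buckets[index] = new Entry(symbol, buckets[index]);
        return symbol;
    }

    private static int hash(char[] buffer, int offset, int length) {
        int h = 0;
        for (int i = 0; i < length; i++) {
            h = 31 * h + buffer[offset + i];
        }
        return h & 0x7FFFFFFF; // keep the value non-negative
    }

    private static boolean matches(String symbol, char[] buffer, int offset, int length) {
        if (symbol.length() != length) {
            return false;
        }
        for (int i = 0; i < length; i++) {
            if (symbol.charAt(i) != buffer[offset + i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        MiniSymbolTable table = new MiniSymbolTable(16);
        char[] input = "<item><item>".toCharArray();
        String first = table.addSymbol(input, 1, 4);  // "item" from the first tag
        String second = table.addSymbol(input, 7, 4); // "item" from the second tag
        System.out.println(first == second);          // prints "true"
    }
}
```

With String.intern() the caller would have to build a new String from the buffer before each lookup. Here the second call finds the existing entry by comparing characters in place.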

An instance of SymbolTable has two parameters that affect its performance: initial capacity and load factor. The capacity is the number of buckets in the SymbolTable, and the initial capacity is simply the capacity at the time the SymbolTable is created. Note that the SymbolTable is open: in the case of a "hash collision", a single bucket stores multiple entries, which must be searched sequentially. The load factor is a measure of how full the SymbolTable is allowed to get before its capacity is automatically increased. When the number of entries in the SymbolTable exceeds the product of the load factor and the current capacity, the capacity is increased by calling the rehash method.

Generally, the default load factor (.75) offers a good tradeoff between time and space costs. Higher values decrease the space overhead but increase the time cost to look up an entry (which is reflected in most SymbolTable operations, including addSymbol and containsSymbol).

The initial capacity controls a tradeoff between wasted space and the need for rehash operations, which are time-consuming. No rehash operations will ever occur if the initial capacity is greater than the maximum number of entries the Hashtable will contain divided by its load factor. However, setting the initial capacity too high can waste space.

If many entries are to be made into a SymbolTable, creating it with a sufficiently large capacity may allow the entries to be inserted more efficiently than letting it perform automatic rehashing as needed to grow the table.
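
As a worked example of the sizing rule above: to avoid rehashing, the capacity has to be greater than the expected number of entries divided by the load factor. The helper below is only an illustration; CapacityEstimate and capacityFor are made-up names, and the figure of 1000 entries is an assumption.

```java
/** Hypothetical helper illustrating the capacity rule quoted above. */
final class CapacityEstimate {

    /** Smallest capacity strictly greater than maxEntries / loadFactor. */
    static int capacityFor(int maxEntries, double loadFactor) {
        return (int) Math.floor(maxEntries / loadFactor) + 1;
    }

    public static void main(String[] args) {
        // 1000 / 0.75 = 1333.33..., so the next larger integer is 1334.
        int capacity = capacityFor(1000, 0.75);
        System.out.println(capacity); // prints "1334"
    }
}
```

With a capacity of 1334 and a load factor of .75 the rehash threshold is 1000.5, so 1000 entries never exceed it.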

Does that help?

Regards, Bischi

nul

23-05-2006, 18:07

Not really, I had already read that.
I don't understand how the parser uses this table.
Do you maybe have a small example?

bischi

23-05-2006, 19:28

At the risk of talking complete nonsense here:

This class is a symbol table implementation that guarantees that strings used as identifiers are unique references. Multiple calls to addSymbol will always return the same string reference.

afaik in XML you define the keywords (the tag names) yourself, as opposed to the fixed ones in HTML (<body>, ...). As I understand it now, this class makes sure that each such keyword is kept only once, as a single string reference. I don't know enough XML to be more precise...
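
A rough sketch of that idea, not of how Xerces actually does it: a naive scanner interns every tag name it reads, so a name that occurs many times in a document is represented by one String object and can be compared with == instead of equals(). TagNameScanner and its methods are hypothetical names, and a plain HashMap stands in for the symbol table here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical toy scanner; a HashMap stands in for the symbol table. */
final class TagNameScanner {

    private final Map<String, String> symbols = new HashMap<>();

    /** Returns the one shared String instance for this name. */
    String intern(String name) {
        String existing = symbols.putIfAbsent(name, name);
        return existing != null ? existing : name;
    }

    /** Collects the names of all start tags. Very naive: no attributes, no comments. */
    List<String> scanStartTags(String xml) {
        List<String> names = new ArrayList<>();
        int pos = 0;
        while ((pos = xml.indexOf('<', pos)) >= 0) {
            int end = xml.indexOf('>', pos);
            if (end < 0) {
                break; // unterminated tag, stop scanning
            }
            if (xml.charAt(pos + 1) != '/') { // skip end tags such as </item>
                names.add(intern(xml.substring(pos + 1, end)));
            }
            pos = end + 1;
        }
        return names;
    }

    public static void main(String[] args) {
        TagNameScanner scanner = new TagNameScanner();
        List<String> names =
                scanner.scanStartTags("<list><item>a</item><item>b</item></list>");
        // names holds "list", "item", "item"; both "item" entries are the same object.
        System.out.println(names.get(1) == names.get(2)); // prints "true"
    }
}
```

Without the intern step each substring call would produce a separate String, and the == comparison would be false.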

Regards, Bischi