70-562 Reading XML

System.Xml.XmlDocument provides the methods required to access XML content directly.

We will cover reading tree-based, in-memory XML.
It is also possible to work with the XmlReader/XmlWriter classes.

The XML Document Object Model (DOM) is a tree-based, in-memory cache, represented by the XmlDocument class.

Non-cached, forward-only access is done using the XmlReader (read-only) and XmlWriter (write-only) classes.

It is also possible to use LINQ to XML to work with XML.

The XmlDocument class loads XML as a tree that has a root node and multiple child nodes.
Each XML element corresponds to one or more nodes in the tree.
Each node except for the root node has a parent node.
All the data is loaded in memory, which provides random access to individual nodes and lets you search for any node at will.
The tree-based structure represents the XML infoset. This is the data stored in the serialized XML format.
The API doesn't care about angle brackets and quotes, just the data and its relationships. There is a difference between the serialized XML stored on disk and its representation in memory. You don't want to work with quotes and brackets; you want to work with the in-memory tree representation.

Non-cached XML handling allows fast access and requires less overhead, but doesn't allow random access to the data. Access is sequential: data flows to the application in the order it appears.
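To make the sequential model concrete, here is a minimal sketch using XmlReader. The class name, method name and sample XML are made up for illustration; only the XmlReader calls come from the framework.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class ForwardOnlyReaderSketch
{
    // Hypothetical sample data, not taken from a real file.
    public const string SampleXml =
        "<Grocery><Department Name=\"Fruits\"><Item>Apple</Item><Item>Pear</Item></Department></Grocery>";

    // Collects element names in the order the reader encounters them.
    public static List<string> ReadElementNames(string xml)
    {
        var names = new List<string>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            // Read() advances one node at a time; there is no way to step back.
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element)
                {
                    names.Add(reader.Name);
                }
            }
        }
        return names;
    }
}
```

Calling ForwardOnlyReaderSketch.ReadElementNames(ForwardOnlyReaderSketch.SampleXml) yields Grocery, Department, Item, Item, in document order. Nothing is kept in memory beyond what the code stores itself.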

Text within an element in the serialized representation corresponds to a child node in memory, not to a value stored on the element itself. This is the case for any XML element; that's how the XML DOM parses it, and these nodes appear as #text in memory.
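A small sketch of this, assuming a one-element document (the Item element and its content are invented for the example, and LoadXml is used so that no file is needed):

```csharp
using System.Xml;

public static class TextNodeSketch
{
    // Describes the first child of a hypothetical <Item> element.
    public static string DescribeFirstChild()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<Item>Apple</Item>"); // hypothetical sample data

        XmlNode text = doc.DocumentElement.FirstChild;
        // The text is a separate node of type Text, named "#text".
        return text.NodeType + " " + text.Name + " " + text.Value;
    }
}
```

TextNodeSketch.DescribeFirstChild() returns "Text #text Apple": the element's own Value is null, and the string lives in the child text node.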

An attribute can't contain other attributes or other elements.

Load XML data: you need to load the XML data into an XmlDocument object, which parses the persisted XML into a tree-based structure in memory. This is done by first creating an XmlDocument object, then calling the Load method, providing a file name or a stream.

Once it's loaded, it's possible to work in the tree format. To view the retrieved XML in serialized format, use the OuterXml property.
Set the PreserveWhitespace property before loading to keep the white space found in the serialized XML.
The root node (the XmlDocument itself) typically contains an XML declaration, any top-level comments, and the document element. Every node type is based on XmlNode. The document element is the root element of the document, available through the DocumentElement property.
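The loading steps can be sketched as follows. The sample XML and the LoadSketch helper are assumptions made for the example; a StringReader stands in for the file or stream so the snippet is self-contained.

```csharp
using System.IO;
using System.Xml;

public static class LoadSketch
{
    // Hypothetical sample data with line breaks and indentation between elements.
    public const string SampleXml =
        "<?xml version=\"1.0\"?>\n<!-- price list -->\n<Grocery>\n  <Department Name=\"Fruits\" />\n</Grocery>";

    public static XmlDocument LoadFromString(string xml, bool preserveWhitespace)
    {
        var doc = new XmlDocument();
        // Must be set before loading; it does not change content already parsed.
        doc.PreserveWhitespace = preserveWhitespace;
        using (var reader = new StringReader(xml))
        {
            doc.Load(reader); // Load also accepts a file name or a Stream
        }
        return doc;
    }
}
```

With preserveWhitespace set to false, the Grocery element has a single child (the Department element). With true, the white space around it is kept as two extra nodes, so ChildNodes.Count is 3. In both cases doc.OuterXml returns the serialized form and doc.DocumentElement is the Grocery element.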

Use the HasChildNodes property or the ChildNodes property to determine whether an element has child nodes and to access them. There is no recursion: only direct children are returned.
ChildNodes.Count tells you how many child nodes there are, and you can check the NodeType property (an XmlNodeType value) if you're looking for specific types, such as text nodes.
You can use doc.GetElementsByTagName(name), which returns an XmlNodeList that you can iterate through.
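A short sketch of the difference between ChildNodes (direct children only) and GetElementsByTagName (any depth). The grocery document and the helper names are invented for the example.

```csharp
using System.Collections.Generic;
using System.Xml;

public static class ChildNodesSketch
{
    // Hypothetical sample data.
    public const string SampleXml =
        "<Grocery>" +
        "<Department Name=\"Fruits\"><Item>Apple</Item><Item>Pear</Item></Department>" +
        "<Department Name=\"Dairy\"><Item>Milk</Item></Department>" +
        "</Grocery>";

    // Counts only the direct children of the document element.
    public static int CountDirectChildren(XmlDocument doc)
    {
        XmlNode root = doc.DocumentElement;
        return root.HasChildNodes ? root.ChildNodes.Count : 0;
    }

    // Collects the text of every element with the given name, at any depth.
    public static List<string> TextOf(XmlDocument doc, string tagName)
    {
        var values = new List<string>();
        foreach (XmlNode node in doc.GetElementsByTagName(tagName))
        {
            if (node.NodeType == XmlNodeType.Element)
            {
                values.Add(node.InnerText);
            }
        }
        return values;
    }
}
```

After doc.LoadXml(ChildNodesSketch.SampleXml), CountDirectChildren(doc) is 2 (the two Department elements), while TextOf(doc, "Item") finds all three items even though none of them is a direct child of Grocery.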

For more general criteria, use the SelectNodes or SelectSingleNode methods of XmlNode with an XPath expression.
In XPath, use "//" to search at any level beneath a node, and "/" to go only one level beneath.
You can use attributes as criteria in XPath: //Department[@Name="Fruits"] returns every Department element whose Name attribute is set to Fruits.
You can also use //Grocery//Department, which looks for Department elements at any level below the Grocery node.
Use XPath expressions to search for specific nodes and node attributes; SelectNodes returns the matches as an XmlNodeList.
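The XPath forms above can be tried with a sketch like this one. The Aisle element and the rest of the sample data are assumptions, added so that one Department sits deeper in the tree than the other.

```csharp
using System.Collections.Generic;
using System.Xml;

public static class XPathSketch
{
    // Hypothetical sample data: one Department directly under Grocery, one nested in Aisle.
    public const string SampleXml =
        "<Grocery>" +
        "<Department Name=\"Fruits\"><Item>Apple</Item><Item>Pear</Item></Department>" +
        "<Aisle><Department Name=\"Dairy\"><Item>Milk</Item></Department></Aisle>" +
        "</Grocery>";

    // Runs an XPath query and returns the text of each matching node.
    public static List<string> Select(string xpath)
    {
        var doc = new XmlDocument();
        doc.LoadXml(SampleXml);

        var result = new List<string>();
        foreach (XmlNode node in doc.SelectNodes(xpath))
        {
            result.Add(node.InnerText);
        }
        return result;
    }
}
```

With this data, Select("/Grocery/Department") matches one node, Select("//Grocery//Department") matches two, and Select("//Department[@Name='Fruits']/Item") returns Apple and Pear.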

If the content includes a namespace, use the XmlNamespaceManager; otherwise XPath queries against a namespaced document will not match any nodes. Even the default namespace needs to have a prefix configured in the XML namespace manager.
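A minimal sketch of the namespace case. The namespace URI urn:example:grocery and the prefix g are made up for the example; any prefix works as long as it is mapped to the document's namespace URI.

```csharp
using System.Xml;

public static class NamespaceSketch
{
    // Hypothetical document that declares a default namespace.
    public const string SampleXml =
        "<Grocery xmlns=\"urn:example:grocery\"><Department Name=\"Fruits\" /></Grocery>";

    public static int CountWithoutManager()
    {
        var doc = new XmlDocument();
        doc.LoadXml(SampleXml);
        // Unprefixed names in XPath mean "no namespace", so nothing matches.
        return doc.SelectNodes("//Department").Count;
    }

    public static int CountWithManager()
    {
        var doc = new XmlDocument();
        doc.LoadXml(SampleXml);

        var ns = new XmlNamespaceManager(doc.NameTable);
        ns.AddNamespace("g", "urn:example:grocery"); // "g" is an arbitrary prefix
        return doc.SelectNodes("//g:Department", ns).Count;
    }
}
```

CountWithoutManager() returns 0 and CountWithManager() returns 1, even though the document itself never uses a prefix.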

XmlNode.SelectSingleNode retrieves the first matching node. Once you have found a node, you might want to navigate to related nodes: use the ChildNodes property and the FirstChild/LastChild, PreviousSibling/NextSibling, and ParentNode properties.
You can use XmlNode.Attributes, which returns an XmlAttributeCollection.
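As an illustration of navigating from a found node, here is a sketch with invented sample data (the Floor attribute and the class name are assumptions for the example):

```csharp
using System.Xml;

public static class NavigationSketch
{
    // Hypothetical sample data.
    public const string SampleXml =
        "<Grocery><Department Name=\"Fruits\" Floor=\"1\">" +
        "<Item>Apple</Item><Item>Pear</Item>" +
        "</Department></Grocery>";

    public static string Describe()
    {
        var doc = new XmlDocument();
        doc.LoadXml(SampleXml);

        // SelectSingleNode returns the first match, or null when nothing matches.
        XmlNode apple = doc.SelectSingleNode("//Item");
        XmlNode pear = apple.NextSibling;
        XmlNode department = apple.ParentNode;

        // Attributes is an XmlAttributeCollection, indexable by name.
        string name = department.Attributes["Name"].Value;
        return name + ": " + department.FirstChild.InnerText + ", " + pear.InnerText +
               " (" + department.Attributes.Count + " attributes)";
    }
}
```

NavigationSketch.Describe() returns "Fruits: Apple, Pear (2 attributes)". In real code, check the result of SelectSingleNode for null before navigating from it.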

Generic35's Blog
