Wednesday, 1 May 2013

XML tutorial


Welcome to XML and related specifications

XML is the World Wide Web Consortium's (W3C) specification for interchanging structured data in Web applications. An XML document enables you to store data in the same way a database enables you to store data. However, unlike databases, an XML document stores data in the form of plain text, which can be understood by any type of device, whether it is a mainframe computer, a palmtop, or a cell phone. Thus, XML serves as a standard interface required for interchanging data between various Web applications.

Q1) XML is a markup language that enables you to enclose data within tags. So how is it different from HTML?

Answer: Several things set XML apart from HTML.

The difference lies in the fact that HTML has a set of predefined tags, many of which describe how content is presented. XML does not provide any predefined set of tags. Rather, it enables you to create your own tags, and those tags serve the purpose of structuring the content within the XML document. No presentation or appearance is associated with any of the XML tags. In that sense, XML can be called a meta-markup language, which enables you to create your own markup or vocabulary.

In fact, many existing markup languages have been derived from XML. Some examples of markup languages that are based on XML are Wireless Markup Language (WML), which is used to create Web applications that can be accessed using a cell phone, and MathML, which is used to represent mathematical equations.

When you open an XML document in a browser, it is typically displayed in the form of a tree view, which can be expanded and collapsed.

While creating an XML document, you must remember some basic rules:

# All tags must be closed. In HTML, the browser usually does not report an error even if you don't close a tag. However, in XML, all opening tags must have corresponding closing tags; otherwise, the browser displays an error message.



# All empty tags must include a / character before the closing angle bracket (>). For example, in HTML, you have the <IMG> tag for inserting an image in a Web page. This tag does not require any closing tag because the tag itself contains all the relevant information, such as the image source and its position. Therefore, <IMG> is an empty tag. If you want to use an empty tag called <Image> in XML, you must include a / character before the closing angle bracket of the tag as follows:
<Image src="tree.gif" />

# Tags should not overlap; that is, the innermost tag must be closed before closing the outer tags. Consider the following code:

<FirstName> James <LastName> Ford </FirstName> </LastName>

This statement would result in an error because the outer tag, <FirstName>, has been closed before the inner tag, <LastName>.

# Tags are case-sensitive. Therefore, the case of the closing tag must match the case used in the opening tag.

An XML document that conforms to these rules is called a well-formed XML document.
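The rules above can be tried out with any XML parser. The following is a minimal sketch in Python, assuming the standard xml.etree.ElementTree module (which is not part of the original tutorial); the helper name is_well_formed and the sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts text as XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# One sample per rule; the labels are illustrative only.
samples = {
    "closed tags": "<Name>James</Name>",
    "unclosed tag": "<Name>James",
    "empty tag with slash": '<Image src="tree.gif" />',
    "empty tag without slash": '<Image src="tree.gif">',
    "overlapping tags": "<FirstName>James<LastName>Ford</FirstName></LastName>",
    "mismatched case": "<Name>James</name>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'well formed' if ok else 'not well formed'}")
```

Only the samples that follow all of the rules are accepted; the parser rejects every other sample with a parse error.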


A good way to understand the difference between XML and HTML is to compare the same data in both forms. The HTML FORM and XML FORM listings later in this section show a list of items and their prices in each format.

Note: XML is used to transport data, while HTML is used to format and display data. XML is also one of the most common tools for transmitting data between all sorts of applications.


XML Simplifies Data Transport

One of the most time-consuming challenges for developers is to exchange data between incompatible systems over the Internet.
Exchanging data as XML greatly reduces this complexity, since the data can be read by different incompatible applications.

XML Simplifies Platform Changes

Upgrading to new systems (hardware or software platforms) is always time consuming. Large amounts of data must be converted, and incompatible data is often lost.
XML data is stored in text format. This makes it easier to expand or upgrade to new operating systems, new applications, or new browsers, without losing data.

XML Makes Your Data More Available

Different applications can access your data, not only in HTML pages, but also from XML data sources.
With XML, your data can be made available to all kinds of "reading machines" (handheld computers, voice machines, news feeds, etc.), which also makes it more accessible to blind people and people with other disabilities.



HTML FORM

<HTML>
<HEAD> <TITLE> Items Data </TITLE> </HEAD>
<BODY>
<OL>
<LI> Item Name : Chocolate Price: 1 </LI>
<LI> Item Name: Cadbury Price: 2.5 </LI>
</OL>
</BODY>
</HTML>




XML FORM

<?xml version="1.0"?>
<ITEMS>
<ITEM>
<NAME> Chocolate </NAME>
<PRICE> 1 </PRICE>
</ITEM>
<ITEM>
<NAME> Cadbury </NAME>
<PRICE> 2.5 </PRICE>
</ITEM>
</ITEMS>
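Because the XML form labels each piece of data with its own tag, a program can pick out names and prices without guessing at the layout of the text. A small sketch of this, assuming Python's standard xml.etree.ElementTree module (the variable names are illustrative and not part of the original):

```python
import xml.etree.ElementTree as ET

ITEMS_XML = """<?xml version="1.0"?>
<ITEMS>
<ITEM>
<NAME> Chocolate </NAME>
<PRICE> 1 </PRICE>
</ITEM>
<ITEM>
<NAME> Cadbury </NAME>
<PRICE> 2.5 </PRICE>
</ITEM>
</ITEMS>"""

root = ET.fromstring(ITEMS_XML)

# Each ITEM element yields a (name, price) pair; surrounding spaces are trimmed.
items = [
    (item.findtext("NAME").strip(), float(item.findtext("PRICE")))
    for item in root.findall("ITEM")
]
total = sum(price for _, price in items)

for name, price in items:
    print(name, price)
print("Total:", total)
```

Getting the same values out of the HTML form would mean parsing the free text inside each <LI> element.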

In ASP.NET, XML is used in several places, depending on the feature involved.

The main ASP.NET topics that involve XML include ADO.NET, security (authentication and authorization), Web configuration, and Web services.



ADO.NET Basics

Microsoft ADO.NET is the successor to ADO. ADO.NET provides platform interoperability and scalable data access. In the .NET Framework, ADO.NET can transmit data in the Extensible Markup Language (XML) format. Therefore, any application that can read the XML format can process the data. The receiving component might be a Microsoft Visual Studio-based solution or any application running on any other platform.

Like ADO, ADO.NET also allows you to access data when disconnected from the actual data source. However, unlike ADO, ADO.NET uses XML as the data format. Because XML is a universal data format, ADO.NET extends interoperability to the Internet. In addition, instead of recordsets, ADO.NET uses the DataSet and DataReader objects to access and manipulate data. With HTTP as the transport, XML data can pass through firewalls.





Web Configuration settings:

The application-level configuration settings are stored in an Extensible Markup Language (XML) format. The XML format is a hierarchical text format, which is easy to read and write. This format makes it easy to apply new settings to applications without the aid of any local administration tools.

Security:

The XML configuration file also defines the security settings for an application. An administrator can easily read, write, and modify the authentication and authorization settings in it. Because the file is plain XML rather than compiled code, it is never executed; it is only read as data, which helps the administrator keep the security settings under control.

Web Services:

A Web service is a programmable URL. Stated another way, a Web service is an application component that is remotely callable using standard Internet protocols and formats such as HTTP and XML. XML is well suited to transferring data between different interfaces, and the receiving application can process it through the Document Object Model (DOM). Thus, any system that supports these basic, standard protocols is capable of supporting Web services.

An Overview of XML-Related Specifications

XML does not exist all by itself. Numerous additional XML-related specifications provide guidelines for working with XML documents. Before discussing the implementation of XML in ASP.NET, it is important to understand these XML-related specifications. Therefore, this section looks at some of the important XML-related W3C specifications.

Document Type Definition

A Document Type Definition (DTD) enables you to specify the structure of the content in an XML document. Creating a DTD is similar to using a CREATE TABLE statement in SQL, in which you specify the columns to be included in the table and whether they can hold null values. In a DTD, you can specify the elements that can be used in an XML document and specify whether it is mandatory to provide values for the elements.
When you include a DTD in an XML document, software checks the structure of the XML document against the DTD. This process of checking the structure of the XML document is called validating. The software that performs the task of validating is called a parser.

The following are the two types of parsers:

·        Nonvalidating parser: Checks only whether an XML document is well formed. An example of a nonvalidating parser is the expat parser.

·        Validating parser: Checks whether an XML document is well formed and whether it conforms to the DTD that it uses. An XML document that conforms to its DTD is called a valid document.

Rules for Creating DTDs

When creating a DTD, you need to define all the elements and attributes that the XML documents will have. Some syntax to remember when creating DTDs is the following:

   Symbol    Meaning                              Example
   ,         AND                                  header (sender, recipient*, date)
   |         OR                                   message (email | letter)
   ()        occurs only once                     (email | letter)
   +         must occur at least once             (header, subject?, text+)
   ?         occurs either once or not at all     (header, recipient*, date?)
   *         can occur zero or more times         (sender, recipient*, date)


There are two ways of declaring the DTD for an XML document:

1)    Internal DTD Declaration
2)    External DTD Declaration

An example of a DTD is given as follows:
<!ELEMENT ITEMS (ITEM)+>
<!ELEMENT ITEM (NAME, PRICE)>
<!ELEMENT NAME (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>

In this example, we have declared four elements, ITEMS, ITEM, NAME, and PRICE. After specifying the element name, you specify the type of content of that element. In case of ITEMS, the content type is (ITEM)+, which means that this element can contain one or more ITEM elements. Similarly, the ITEM element contains the elements NAME and PRICE, which contain character data. This type of data is represented as (#PCDATA).
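To make the four declarations concrete, the sketch below spells out by hand the checks that a validating parser would derive from this DTD. It is written in Python with the standard xml.etree.ElementTree module, which does not validate against DTDs itself; the function name conforms_to_items_dtd is hypothetical, and the check is deliberately simplified (for example, it ignores stray text between elements).

```python
import xml.etree.ElementTree as ET


def conforms_to_items_dtd(text):
    """Hand-written check mirroring the four declarations of the ITEMS DTD."""
    root = ET.fromstring(text)
    if root.tag != "ITEMS":
        return False
    children = list(root)
    # <!ELEMENT ITEMS (ITEM)+> : one or more ITEM children and nothing else
    if not children or any(child.tag != "ITEM" for child in children):
        return False
    for item in children:
        # <!ELEMENT ITEM (NAME, PRICE)> : exactly NAME followed by PRICE
        if [child.tag for child in item] != ["NAME", "PRICE"]:
            return False
        # NAME and PRICE are (#PCDATA): text only, no child elements
        if any(len(child) > 0 for child in item):
            return False
    return True


valid = "<ITEMS><ITEM><NAME>Chocolate</NAME><PRICE>1</PRICE></ITEM></ITEMS>"
empty = "<ITEMS></ITEMS>"
wrong_order = "<ITEMS><ITEM><PRICE>1</PRICE><NAME>Chocolate</NAME></ITEM></ITEMS>"

print(conforms_to_items_dtd(valid))
print(conforms_to_items_dtd(empty))
print(conforms_to_items_dtd(wrong_order))
```

All three sample documents are well formed, but only the first one would also count as valid against the DTD.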

Note: - DTD files have the extension .dtd.


                                 "Well Formed" vs. Valid


When talking about XML documents, two commonly used terms are "well formed" and "valid," as in "Is your document marked up in valid and well formed XML?"
Well formed, in relation to XML, means that the document has no syntax, spelling, punctuation, or grammar errors in its markup. These kinds of errors can prevent your XML document from being parsed. Valid means that the document is well formed and also conforms to its DTD.
Note: An XML parser is software that reads XML documents and interprets or "parses" the code according to the XML standard. A parser is needed to perform actions on XML. For example, a parser would be needed to compare an XML document to a DTD.

A DTD is declared inside an XML document with the following syntax:
<!DOCTYPE root-element [element-declarations]>

1)    Internal DTD Declaration
<?xml version="1.0"?>
<!DOCTYPE note [
 <!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend</body>
</note>
The DTD above is interpreted like this:
  • !DOCTYPE note defines that the root element of this document is note
  • !ELEMENT note defines that the note element contains four elements: "to,from,heading,body"
  • !ELEMENT to defines the to element  to be of type "#PCDATA"
  • !ELEMENT from defines the from element to be of type "#PCDATA"
  • !ELEMENT heading defines the heading element to be of type "#PCDATA"
  • !ELEMENT body defines the body element to be of type "#PCDATA"
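The interpretation above can be observed with the expat parser mentioned earlier. The following sketch assumes Python's standard xml.parsers.expat binding (not something the original tutorial uses); it reads the declarations in the internal DTD and reports them, but, being nonvalidating, it does not check the document against them. The handler and variable names are illustrative.

```python
import xml.parsers.expat

NOTE_XML = """<?xml version="1.0"?>
<!DOCTYPE note [
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend</body>
</note>"""

doctype = {}
declared = {}


def on_doctype(name, system_id, public_id, has_internal_subset):
    # Called once when the parser reaches <!DOCTYPE ...>
    doctype["root"] = name
    doctype["internal"] = bool(has_internal_subset)


def on_element_decl(name, model):
    # model is a tuple (type, quantifier, name, children); each child has the same shape
    declared[name] = [child[2] for child in model[3]]


parser = xml.parsers.expat.ParserCreate()
parser.StartDoctypeDeclHandler = on_doctype
parser.ElementDeclHandler = on_element_decl
parser.Parse(NOTE_XML, True)

print("Root element:", doctype["root"])
for element, children in declared.items():
    print(element, "->", children if children else "#PCDATA")
```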

Save the document and open it in a browser. To see the DTD along with the document structure, use the browser's View Page Source option.

2)    External DTD Declaration
Immediately following the XML declaration, you either link to an external DTD or write an internal DTD. While DTDs can be both internal and external, if you are using a DTD for multiple documents, it makes more sense to keep the DTD in a separate "external" file. Otherwise, you will have to put the full DTD in the prolog of every XML document, rather than just one line of code.
To link to an external DTD, the declaration goes like this:
<!DOCTYPE RootElementName SYSTEM "DTDfileLocation">
For example:
<!DOCTYPE ITEMS SYSTEM "ITEMS.DTD">

SYSTEM in the DTD declaration can be replaced by PUBLIC if the DTD is publicly available, for example via the Internet. The declaration then carries a public name (public identifier) for the DTD, followed by its location. For example, the W3C publishes DTDs for the various markup languages it recommends. Here is the DTD declaration for strict XHTML:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">




XML Schema

An XML schema provides a way of defining the structure of an XML document. It enables you to describe the elements and attributes that can be present in an XML document. An XML schema is similar to a DTD. However, it can be considered a superset of a DTD in terms of the functionality that it provides. An advantage of using an XML schema is that it enables you to specify the data types for elements.

It also enables you to specify whether an element can contain character data or other elements, or whether it is an empty element.
Another difference between an XML schema and a DTD is that an XML schema follows XML syntax. In other words, it is an application of XML, whereas a DTD has its own syntax.

In short, the purpose of an XML schema is to define the legal building blocks of an XML document, just like a DTD.

Why we use a schema:

These are the key advantages of using a schema:

1)    It is easier to describe allowable document content
2)    It is easier to work with data from a database
3)    It is easier to define data patterns (data formats)
4)    It is easier to convert data between different data types
5)    A schema is written in XML syntax
6)    Because a schema is itself XML, it can be transformed with XSLT



Structure of an XML schema:

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="note">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="to" type="xs:string"/>
      <xs:element name="from" type="xs:string"/>
      <xs:element name="heading" type="xs:string"/>
      <xs:element name="body" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
</xs:schema>

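To associate an XML document with a schema, you can use the schemaLocation attribute on the document's root element, for example:

xsi:schemaLocation="D:\dotnet\website300\ XMLSchema.xsd">

The value of this attribute is a pair of values separated by a space. The first value is the namespace to use. The second value is the location of the XML schema to use for that namespace. When the schema has no target namespace, as in the schema above, the xsi:noNamespaceSchemaLocation attribute, which takes only the location, is used instead.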

Defining a Simple Element


A simple element is an XML element that contains only text. It cannot contain any other elements or attributes.
The text can be of one of the types included in the XML Schema definition (boolean, string, date, etc.), or it can be of a custom type that you define yourself.
The syntax for defining a simple element is:
<xs:element name="xxx" type="yyy"/>
where xxx is the name of the element and yyy is the data type of the element.
XML Schema has a lot of built-in data types. The most common types are:
  • xs:string
  • xs:decimal
  • xs:integer
  • xs:boolean
  • xs:date
  • xs:time

Here are some XML elements:
<lastname>Refsnes</lastname>
<age>36</age>
<dateborn>1970-03-27</dateborn>

And here are the corresponding simple element definitions:
<xs:element name="lastname" type="xs:string"/>
<xs:element name="age" type="xs:integer"/>
<xs:element name="dateborn" type="xs:date"/>
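One way to picture what these type declarations buy you: a program that knows the declared type can convert the element text into a typed value and reject text that does not fit. The sketch below is a Python illustration only, not a schema validator; the CONVERTERS and DECLARED_TYPES tables and the typed_value helper are hypothetical names that simply restate the three definitions above.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Hypothetical mapping from a few built-in schema types to Python converters.
CONVERTERS = {
    "xs:string": str,
    "xs:integer": int,
    "xs:date": date.fromisoformat,
}

# The three simple element definitions above, restated as name -> type.
DECLARED_TYPES = {
    "lastname": "xs:string",
    "age": "xs:integer",
    "dateborn": "xs:date",
}


def typed_value(xml_text):
    """Parse one simple element and convert its text using the declared type."""
    element = ET.fromstring(xml_text)
    convert = CONVERTERS[DECLARED_TYPES[element.tag]]
    return convert(element.text)


lastname = typed_value("<lastname>Refsnes</lastname>")
age = typed_value("<age>36</age>")
dateborn = typed_value("<dateborn>1970-03-27</dateborn>")
print(lastname, age, dateborn.year)
```

Text that does not match the declared type, such as <age>thirty-six</age>, makes the converter raise a ValueError, which is roughly where a schema-aware parser would report a validation error.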



Extensible Stylesheet Language Transformations (XSL/T)

XSL stands for Extensible Stylesheet Language, and is a style sheet language for XML documents. XSLT stands for XSL Transformations. It is a language for transforming XML documents into XHTML (HTML reformulated as XML) documents or into other XML documents. XPath is a language for navigating in XML documents.

As discussed earlier, XML does not deal with the presentation of data contained within an XML document. It concentrates only on the structure and the data contained within the structure. This separation of the data and its presentation enables you to display the same data in various formats. However, because an XML document does not contain any formatting instructions for displaying data, you need some special tool that can convert an XML document into a user-viewable format. XSL/T is a W3C specification for transforming XML documents so that they can be displayed in the desired format. XSL/T follows XML syntax.
  • XSLT is the most important part of XSL
  • XSLT transforms an XML document into another XML document
  • XSLT uses XPath to navigate in XML documents
  • XSLT is a W3C Recommendation






Support for XML in ASP.NET

The growing popularity of XML as a common data interchange format between Web applications has resulted in an increase in the number of software platforms that support XML, and ASP.NET is no exception. ASP.NET enables you to work with XML by supporting a number of XML-related classes. Some of the features provided in ASP.NET for working with XML are as follows:
·         System.Xml namespace
·         XML server-side control
·         Data conversion from a relational to XML format

System.Xml namespace

The System.Xml namespace is a collection of classes that are used to process an XML document. This namespace supports XML-related specifications, such as DTDs, XML schemas, XML namespaces, XML DOM, and XSL/T. Some of the classes present in the System.Xml namespace are as follows:

§ XmlDocument: Represents a complete XML document.
§ XmlDataDocument: Derived from the XmlDocument class and enables you to store and manipulate XML.
§ XmlElement: Represents a single element from an XML document.
§ XmlAttribute: Represents a single attribute of an element.
§ XmlDocumentType: Represents the DTD used by an XML document.
§ XmlTextReader: Represents a reader that performs a fast, noncached, forward-only read operation on an XML document.
§ XmlTextWriter: Represents a writer that performs a fast, noncached, forward-only generation of streams and files that contain XML data.

XML Web server control

The XML Web server control enables you to insert an XML document as a control within a Web Form. The control has the following properties:

·         DocumentSource: Enables you to specify the URL to the XML document to be displayed in the Web form.

·         TransformSource: Enables you to specify the URL to the XSL/T file, which transforms the XML document into a desired format before it is displayed in the Web form.

·         Document: Enables you to specify a reference to an object of the XmlDocument class.

·         Transform: Enables you to specify a reference to an object of the XslTransform class.

<xsl:template> Element

The <xsl:template> element defines a template. Its match attribute is used to associate the template with an XML element. The match attribute can also be used to define a template for the entire XML document (match="/"). The value of the match attribute is an XPath expression.

<xsl:value-of> Element

The <xsl:value-of> element can be used to extract the value of an XML element and add it to the output stream of the transformation.

<xsl:for-each> Element

The <xsl:for-each> element can be used to select every XML element of a specified node set.

<xsl:sort> Element


To sort the output, simply add an <xsl:sort> element inside the <xsl:for-each> element in the XSL file.


All four properties of the XML server-side control (DocumentSource, TransformSource, Document, and Transform) can be changed programmatically by providing an ID to the control. To use the XML server-side control in ASP.NET, you can use the following syntax:

<asp:Xml DocumentSource="XML document" TransformSource="XSL/T file" Document="XmlDocument object" Transform="XslTransform object" runat="server" />

Consider the following XML document:

<?xml version = "1.0"?>
<Products>
<Product>
<ProductId> P001 </ProductId>
<ProductName> Baby Food </ProductName>
<UnitPrice> 2.5 </UnitPrice>
<QtyAvailable> 1200 </QtyAvailable>
</Product>
<Product>
<ProductId> P002 </ProductId>
<ProductName> Chocolate </ProductName>
<UnitPrice> 1.5 </UnitPrice>
<QtyAvailable> 1500 </QtyAvailable>
</Product>
</Products>
You can transform this XML document into a desired format by creating an XSL/T style sheet. The following style sheet renders each product as an item in an ordered list:


<?xml version="1.0"?>
<xsl:stylesheet version = "1.0" xmlns:xsl =
"http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<OL>
<xsl:for-each select='Products/Product'>
<LI>
<b> <i>
Product Id :
<xsl:value-of select='ProductId'/> <br />
</i></b>
Product Name :
<xsl:value-of select='ProductName'/> <br />
Unit Price :
<xsl:value-of select='UnitPrice'/> <br />
Quantity On Hand :
<xsl:value-of select='QtyAvailable'/> <br />
<hr />
</LI>
</xsl:for-each>
</OL>
</xsl:template>
</xsl:stylesheet>
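To show what this style sheet produces without running an XSLT processor, the sketch below performs the same steps by hand in Python: loop over every Product (the role of xsl:for-each) and emit each child value (the role of xsl:value-of). It is an imitation written for illustration, not an XSLT engine; the function name render_products is hypothetical, and unlike xsl:value-of it trims the spaces around each value.

```python
import html
import xml.etree.ElementTree as ET

PRODUCTS_XML = """<?xml version="1.0"?>
<Products>
<Product>
<ProductId> P001 </ProductId>
<ProductName> Baby Food </ProductName>
<UnitPrice> 2.5 </UnitPrice>
<QtyAvailable> 1200 </QtyAvailable>
</Product>
<Product>
<ProductId> P002 </ProductId>
<ProductName> Chocolate </ProductName>
<UnitPrice> 1.5 </UnitPrice>
<QtyAvailable> 1500 </QtyAvailable>
</Product>
</Products>"""


def render_products(xml_text):
    """Build the ordered list that the style sheet describes."""
    root = ET.fromstring(xml_text)  # root is the <Products> element
    lines = ["<OL>"]
    for product in root.findall("Product"):  # like xsl:for-each
        # Collect each child's text, like one xsl:value-of per field
        value = {child.tag: html.escape(child.text.strip()) for child in product}
        lines.append(
            "<LI><b><i>Product Id : {ProductId}<br /></i></b>"
            "Product Name : {ProductName}<br />"
            "Unit Price : {UnitPrice}<br />"
            "Quantity On Hand : {QtyAvailable}<br /><hr /></LI>".format(**value)
        )
    lines.append("</OL>")
    return "\n".join(lines)


output = render_products(PRODUCTS_XML)
print(output)
```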



You can display the formatted XML document in a Web form by typing the following code in an ASPX file:
<html>
<body>
<asp:xml id="MyXmlDoc" documentsource="products.xml"
transformsource="products.xsl" runat="server">
</asp:xml>
</body>
</html>




Converting Relational Data to XML Format

ASP.NET enables you to easily convert the data from a database into an XML document. ASP.NET provides the XmlDataDocument class, which enables you to load relational data as well as data from an XML document into a data set. The data loaded in an XmlDataDocument object can then be manipulated using the W3C Document Object Model.
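The general idea of turning rows and columns into elements can be sketched independently of ASP.NET. The Python example below is an analogue only and does not use XmlDataDocument: it assumes the standard sqlite3 and xml.etree.ElementTree modules, and the Product table, its columns, and the table_to_xml helper are all made up for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET


def table_to_xml(connection, table, root_tag, row_tag):
    """Turn every row of a table into a row_tag element under root_tag."""
    # The table name is interpolated directly, so it must come from trusted code.
    cursor = connection.execute(f"SELECT * FROM {table}")
    columns = [description[0] for description in cursor.description]
    root = ET.Element(root_tag)
    for row in cursor:
        row_element = ET.SubElement(root, row_tag)
        for column, value in zip(columns, row):
            # Each column becomes a child element holding the value as text
            ET.SubElement(row_element, column).text = str(value)
    return root


# Hypothetical in-memory table mirroring the products document above.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE Product "
    "(ProductId TEXT, ProductName TEXT, UnitPrice REAL, QtyAvailable INTEGER)"
)
connection.executemany(
    "INSERT INTO Product VALUES (?, ?, ?, ?)",
    [("P001", "Baby Food", 2.5, 1200), ("P002", "Chocolate", 1.5, 1500)],
)

products = table_to_xml(connection, "Product", "Products", "Product")
xml_text = ET.tostring(products, encoding="unicode")
print(xml_text)
```

Once the data is in element form, it can be navigated with the same DOM-style calls used for any other XML document.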
