Posts for category 'Programming'

Enumerating the lines of a file

If you've done any programming at all, you'll probably have read a file line by line at some point. Fortunately, most libraries provide an easy way of doing that, and .Net is no exception: the TextReader.ReadLine method provides what you need. If you've used this method, you'll have written something like the following C# code.

using( StreamReader reader = File.OpenText("myfile.txt") )
{
    string line;
    while( (line = reader.ReadLine()) != null )
    {
        // Do something with the line
    }
}

While it does the job, personally I think it would be nicer and more semantic if we could use the foreach keyword for this. It's possible of course to use File.ReadAllLines for this purpose (we can use foreach to enumerate over the array it returns), but that reads the entire file into memory at once, so it's not a good solution if the file you want to read is big.

Fortunately, we can make it possible to do this with very little code indeed, thanks to extension methods (introduced in .Net 3.5) and the yield keyword (introduced in .Net 2.0).

public static class TextReaderExtensions
{
    public static IEnumerable<string> EnumerateLines(this TextReader reader)
    {
        if( reader == null )
            throw new ArgumentNullException("reader");

        string line;
        while( (line = reader.ReadLine()) != null )
        {
            yield return line;
        }
    }
}

Note that the extension is defined on TextReader, so you're not limited to using it with StreamReader; you can also use it with StringReader or anything else that inherits from TextReader.

With these few lines of code, we can now use foreach to enumerate over the lines of a file:

using( StreamReader reader = File.OpenText("myfile.txt") )
{
    foreach( string line in reader.EnumerateLines() )
    {
        // Do something with the line
    }
}

While this isn't much shorter than the original, it looks much nicer in my opinion.

Categories: Programming
Posted on: 2008-09-26 07:25 UTC. Show comments (3)

FormatC source code formatting

I am proud to announce a new utility here on Ookii.org: FormatC.

FormatC is a utility that allows you to add syntax highlighting to your C#, Visual Basic, C++, XML, HTML, Transact-SQL or PowerShell source code, so you can publish it on a web page or blog post.

Why does the world need yet another syntax highlighter? Mainly, because of none of the existing .Net based ones had the features I needed. That's right, FormatC is the utility I've been using to format source code for my own blog. So if you read my site you've already seen many examples, including this one which demonstrates one of those features I mentioned: Visual Basic XML literals. I dare say I'm one of the first to actually support that, although it does have some limitations (which are mentioned on the FormatC page). In fact, I have support for all C# 3.0 and Visual Basic 9.0 features, including Linq.

You can format your source code using the interface on my site and simply copy/paste the results into a webpage or blog post, and customize the highlighting by editing the provided style sheet (or simply keep the default). You can also download FormatC as a class library to use in your own application, or look at the source code. It's designed to be easily extensible, so you can add your own languages if you want.

If you use it, let me know what you think.

Categories: Software, General computing, Programming
Posted on: 2008-09-06 09:46 UTC. Show comments (3)

Win an MSDN subscription

My good friend Christian Liensberger is giving away two MSDN Premium with Visual Studio Team System subscriptions, worth nearly $11,000! To be eligable you need to create a short screencast about a Microsoft technology, e.g. "how to get started with Silverlight 2". The competition runs from July 1st until July 31st. I will be helping with the judging.

See Christian's blog for full details.

Categories: General computing, Programming
Posted on: 2008-06-28 14:22 UTC. Show comments (1)

Creating an RSS feed with XML literals

As I promised, I will now give an example of how to use XML literals in Visual Basic 9 to create an RSS feed.

RSS feeds are an example where XML literals are ideally suited for the task. RSS feeds are commonly automatically generated, and instead of having to deal with XmlWriter or XSLT or something similar we can create it directly in VB with minimal effort. This is not a made-up example; the RSS feed for ookii.org is currently generated using the XmlWriter approach similar to the first example in the previous post. When .Net 3.5 is released and my host installs it on the server, I will replace that code with what you see in this post.

We will use a generic RssFeed class to generate the XML from. This also has the advantage that if you have multiple different data sources you want to generate an RSS feed for you can reuse this code. All you need to do is fill an RssFeed class with the appropriate data (for which LINQ is also ideally suited).

For brevity, I will not list the full source of the RssFeed class and its associated RssItem and RssCategory classes here. Suffice it to say they are classes that contain properties for things such as the title of a feed or the text of an item. RssFeed has a collection of RssItems and RssItem has a collection of RssCategories. If you want to see the definitions check out the full source of the example.

The first thing we need to take care of is XML namespaces. RSS itself doesn’t use a namespace, but we’ll be using some extensions that do. We’ll be using these namespaces in multiple places in the VB source and it’d be nice if we don’t have to repeat the namespace URI every time. Fortunately, VB allows us to import XML namespaces in much the same way as regular .Net namespace so we can use them in any XML literal in the file:

Imports <xmlns:dc="http://purl.org/dc/elements/1.1/">
Imports <xmlns:slash="http://purl.org/rss/1.0/modules/slash/">
Imports <xmlns:wfw="http://wellformedweb.org/CommentAPI/">

Before we get started on the heavy work, we have one more thing to do. If we want this to be applicable generically we must realize that some items do not apply to all feeds. For instance I will be using the <slash:comments /> element which gives the number of comments for an item. Not all types of items can have comments so that element doesn’t always apply. Although we could put the code to omit these elements directly in the XML embedded expressions this doesn’t aid readability, so I’ve opted to create functions for them. Here we make use of the fact that if an embedded expression returns Nothing, it’s ignored.

Private Function CreateCommentCountElement(ByVal commentCount As Integer?) As XElement
    If commentCount Is Nothing Then
        Return Nothing
    Else
        Return <slash:comments><%= commentCount %></slash:comments>
    End If
End Function

Private Function CreateCommentsElement(ByVal commentLink As String) As XElement
    If commentLink Is Nothing Then
        Return Nothing
    Else
        Return <comments><%= commentLink %></comments>
    End If
End Function

Private Shared Function CreateCommentRssElement(ByVal commentRssUrl As String) As XElement
    If commentRssUrl Is Nothing Then
        Return Nothing
    Else
        Return <wfw:commentRss><%= commentRssUrl %></wfw:commentRss>
    End If
End Function

Private Function CreateCategories(ByVal categories As IEnumerable(Of RssCategory)) As IEnumerable(Of XElement)
    If categories Is Nothing Then
        Return Nothing
    Else
        Return From category In categories _
               Select <category domain=<%= category.Domain %>><%= category.Name %></category>
    End If
End Function

Now we can finally get to the meat of this sample, generating the RSS feed, which is exceedingly simple:

Public Function CreateXml() As XDocument
    Dim itemElements = From item In Items _
                       Select <item>
                                  <title><%= item.Title %></title>
                                  <link><%= item.Link %></link>
                                  <guid isPermaLink=<%= item.GuidIsPermalink.ToString().ToLowerInvariant() %>><%= item.Guid %></guid>
                                  <pubDate><%= item.PubDate.ToString("r") %></pubDate>
                                  <dc:creator><%= item.Creator %></dc:creator>
                                  <%= CreateCommentCountElement(item.CommentCount) %>
                                  <%= CreateCommentsElement(item.CommentsLink) %>
                                  <%= CreateCommentRssElement(item.CommentRssUrl) %>
                                  <description><%= New XCData(item.Description) %></description>
                                  <%= CreateCategories(item.Categories) %>
                              </item>

    Return <?xml version="1.0" encoding="utf-8"?>
           <rss version="2.0">
               <channel>
                   <title><%= Title %></title>
                   <link><%= Link %></link>
                   <dc:language><%= Language %></dc:language>
                   <%= itemElements %>
               </channel>
           </rss>
End Function

I do it in two steps, first the items and then the main feed, but it could easily be done in one, I just find this more readable. One thing to note is the way I create a CDATA section. I do it this way because you can't put embedded expresssions inside a CDATA section, as the embedded expression syntax is valid content for a CDATA section. Is that really all there is to it? Yes! It’s that simple.

But wait, there’s more. Remember last time I mentioned how you can also query existing XML documents. This means we can also easily load any existing RSS feed into the RssFeed class:

Public Shared Function FromXml(ByVal feed As XDocument) As RssFeed
    If feed Is Nothing Then
        Throw New ArgumentNullException("feed")
    End If

    Dim result = From channel In feed.<rss>.<channel> _
                 Select New RssFeed() With _
                     { _
                         .Title = channel.<title>.Value, _
                         .Link = channel.<link>.Value, _
                         .Language = channel.<dc:language>.Value, _
                         .Items = From item In channel.<item> _
                                  Select New RssItem() With _
                                     { _
                                          .Title = item.<title>.Value, _
                                          .Link = item.<link>.Value, _
                                          .Guid = item.<guid>.Value, _
                                          .GuidIsPermalink = (item.<guid>.@isPermaLink = "true"), _
                                          .PubDate = Date.Parse(item.<pubDate>.Value), _
                                          .CommentCount = CType(item.<slash:comments>.Value, Integer?), _
                                          .CommentsLink = item.<comments>.Value, _
                                          .CommentRssUrl = item.<wfw:commentRss>.Value, _
                                          .Description = item.<description>.Value, _
                                          .Categories = From category In item.<category> _
                                                        Select New RssCategory() With _
                                                           { _
                                                                .Name = category.Value, _
                                                                .Domain = category.@domain _
                                                           } _
                                      } _
                     }

    Return result.First()
End Function

Here we can also see the nice new object initializers at work. Imagine if you will how much work this would’ve been with XmlReader, and how much harder to read that code would’ve been. And in case you’re wondering if this won’t crash if a feed omits one of the optional elements, it won’t: if a feed omits e.g. <slash:comments>, in that case the item.<slash:comments> query will return an empty list, and the Value property will return Nothing, no exceptions will be thrown.

The full source of the example is available here.

This article was written for Visual Studio 2008 Beta 2. Some of it may not apply to other versions.

Categories: Programming
Posted on: 2007-10-21 10:15 UTC. Show comments (1)

XML literals in Visual Basic 9

With Visual Basic 9 (part of the .Net Framework 3.5 and Visual Studio 2008) Microsoft is introducing a very cool new feature to VB: XML literals. In case you don’t have a clue what I’m talking about, allow me to explain.

With XML literals you can create and manipulate XML documents directly from VB. Previously, creating an XML document meant you had three choices: use an XmlDocument, XmlWriter or manipulate strings containing XML directly (I hope nobody actually did that last one). For manipulating an existing document you are pretty much stuck with XmlDocument, or in limited cases XmlReader.

Although the existing options from System.Xml are powerful, they can lack a bit in the usability area. If you are creating a complex document with XmlDocument or XmlWriter, there’s no way you’re going to be able to tell at a glance what this XML is going to look like from looking at the code.

With .Net 3.5, we get the new classes in System.Xml.Linq such as XDocument and XElement which are already a bit easier to manipulate and, more importantly, which play nice with LINQ. And in VB9, we get an extra layer of sugar-coating with XML literals.

But enough talk, let’s look at some code. Let’s say we have a Book class and a collection of books in a List(Of Books) that we want to save in an XML document. For the sake of the example we assume that XML serialization is not suitable in this case for whatever reason. Here’s how you would do this in Visual Basic 8 (.Net 2.0 and 3.0):

Public Sub CreateBookXml(ByVal books As IList(Of Book), ByVal file As String)
    Using writer As XmlWriter = XmlWriter.Create(file)
        writer.WriteStartDocument()
        writer.WriteStartElement("Books")

        For Each book As Book In books
            writer.WriteStartElement("Book")
            writer.WriteAttributeString("author", book.Author)
            writer.WriteString(book.Title)
            writer.WriteEndElement() ' Book
        Next

        writer.WriteEndElement() ' Books
    End Using
End Sub

This is a simple example, but you can easily see how this would get ugly quick if the document gets more complex, and if you've done anything like this in .Net 2.0 you've probably experienced it yourself. Now let’s see how we can tackle the same problem with LINQ and XML literals in VB9:

Public Sub CreateBookXml(ByVal books As IList(Of Book), ByVal file As String)
    Dim bookElements = From book In books _
                       Select <Book author=<%= book.Author %>>
                                  <%= book.Title %>
                              </Book>

    Dim document = <?xml version="1.0" encoding="utf-8"?>
                   <Books>
                       <%= bookElements %>
                   </Books>

    document.Save(file)
End Sub

There’s two things here I want to call your attention to. The first is the XML embedded expressions. Using the <%= %> syntax, which is similar to ASP/ASP.NET (so a lot of VB programmers are already familiar with it), you can add dynamic content to an XML literal. And it’s not just attribute values and element content that you can specify this way: element or attribute names can be made dynamic in exactly the same way.

The second is the type inference; although I don’t specify the type of either bookElements or document, this code was written with Option Explicit On, so these are not Object variables and there’s no late binding going on. Both variables are strongly typed according to the compiler-inferred type based on the expression used to initialize them (bookElements is in fact an IEnumerable(Of XElement), while document is an XDocument). Visual Studio also tells you this when you hover over the variable names, and you get full IntelliSense support.

Not only is this version shorter (only slightly, but the more complex the XML, the bigger the difference), it’s also a lot easier to see at a glance what the result document is going to look like. And because it’s compiler-checked, it’s a lot less easy to screw up; one missing WriteEndElement in the XmlWriter version and the whole thing goes to hell.

So what about using an existing document? If you want to extract information from an existing XML document in VB8 or before, you can do this while reading with XmlReader, or if it’s already loaded in an XmlDocument you can use XPath or traverse the object tree manually. With XML literals we also get the ability to query an XDocument in a very natural way. For example, if you have an XDocument with the books XML we created above, here’s how you would find the titles of all the books from a certain author:

Dim document = XDocument.Load("books.xml")

Dim books = From book In document.<Books>.<Book> _
            Where book.@author = "Frank Herbert" _
            Select book.Value

This gives us an IEnumerable(Of String) with all the book titles. You see how we could easily access elements and attributes in the document from the LINQ expression. And unlike the XPath approach, this is again checked at compile time (although whether it matches the schema will not be checked, since it doesn’t know that at compile time, but at least the syntax is checked) unlike the equivalent XPath expression "/Books/Book[@author='Frank Herbert']" that would not be interpreted until the SelectNodes function is called at runtime. Note that if you want to check not the children, but the descendants of a node, XML literals also provides that by using two dots: "document..<Book>" selects all the Book elements in the document (equivalent to "//Book" in XPath).

Another advantage over XPath is that you don’t actually have to learn XPath. If you know VB9 and LINQ that’s all you need.

So what about C#? C# developers will be sad to hear that C# 3.0 won’t get XML literals. You do get the new XDocument etc. classes for use with LINQ, though. For reference, here’s what both examples would look like in C#:

// First example.
public static void CreateBookXml(IList<Book> books, string file)
{
    var bookElements = from book in books
                       select new XElement("Book",
                                           new XAttribute("author", book.Author),
                                           book.Title);

    var document = new XDocument(new XDeclaration("1.0", "utf-8", null),
                                 new XElement("Books", bookElements));

    document.Save(file);
}

// second example
var document = XDocument.Load("books.xml");
var books = from book in document.Elements("Books").Elements("Book")
            where book.Attribute("author").Value == "Frank Herbert"
            select book.Value;

As you can see, XDocument does offer some advantages over XmlDocument, especially in conjunction with LINQ, but it’s nowhere near as nice as XML literals in VB.

Since everybody is probably tired of this artificial books example that everybody seems to use, in my next post on XML literals (which is available here) I will show a real-life example where I use XML literals to create an RSS feed.

This post was based on Visual Studio 2008 Beta 2. Some of the information may not apply to other versions.

Read more about the new XML features in Visual Basic 9 on MSDN.

Categories: Programming
Posted on: 2007-10-19 12:13 UTC. Show comments (1)

Latest posts

Categories

Archive

Syndication

RSS Subscribe

;