C# Homoiconicity in practice or Rendering Identical Résumé in Word and PDF From a Single Serializable Domain Structure.

TL;DR: Watch the video by Mark Seemann about homoiconicity in C#. Appreciate his sample implementation. Then go and see my simplistic implementation, which is actually used in production.

Homoiconicity is the ability of code to be manipulated as data, and of data to represent code. I’m not sure I completely understand the concept, so I’m not going to try to explain it here. Honestly, just watch the video by Mark Seemann about Faking Homoiconicity in C#. After the video, everything I’m talking about here will make much more sense!

In his talk, Mark describes an application that generated a loan proposal document with complex logic and had to send that document to a printer. For my project I had to generate a résumé in both Word and PDF formats. I inherited the project, so PDF generation was already done. I only needed to create the Word document. Easier said than done!

The problem was that PDF generation was tightly coupled to the PDF rendering engine and was in a bit of a “state”. When this part of the project was written, the developers did not know a better way to do it. And I did not imagine this could be done in a neater way. Until I saw Mark’s talk.

The way my project generated PDF documents worked fine until I needed to produce identical documents in Word format. It would have been easier to start from Word and run a converter to PDF, but I did not have that luxury.

Data Structures

Following Mark’s lead, I identified the basic parts I needed to represent in documents and grouped them into higher-level sections. This gave me a domain data structure that represents my document. Then I fed that structure into renderers, and they produced the actual files in the required formats.

My very basic structures turned out to be Paragraph, Table and Bulleted List. Each of these is a class implementing the IResumeElement marker interface. That interface does not have any methods; it exists only so that I can group elements into a collection.

public interface IResumeElement
{
    // represent a small element on a page, like a table or paragraph
}

My Paragraph has properties for text, alignment and font. Table has rows and cells. And Bulleted List consists of a collection of Paragraphs. I’m making things very simple here. My production code has table borders for every cell, cells have backgrounds, the paragraph font can be given a colour, and there are many other properties you’d expect from a proper document. There are 2 reasons I’m omitting the extra details: it makes the sample code easier to understand, and the original production code is not open source.
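To make the description above concrete, here is a minimal sketch of what the three element classes might look like. The property names, the ResumeAlignment enum, the row and cell classes and the default font are all assumptions made for illustration; they are not taken from the production code or the sample solution.

```csharp
using System.Collections.Generic;

// Marker interface from the article, repeated so the sketch compiles on its own.
public interface IResumeElement { }

// Hypothetical alignment options; the real set is not shown in the article.
public enum ResumeAlignment { Left, Centre, Right }

public class ResumeParagraph : IResumeElement
{
    public ResumeParagraph(string text) { Text = text; }

    public string Text { get; set; }
    public ResumeAlignment Alignment { get; set; } = ResumeAlignment.Left;
    public string FontName { get; set; } = "Arial"; // assumed default, purely illustrative
}

// Hypothetical cell: holds paragraphs so that text formatting is reused inside tables.
public class ResumeTableCell
{
    public List<ResumeParagraph> Paragraphs { get; set; } = new List<ResumeParagraph>();
}

public class ResumeTableRow
{
    public List<ResumeTableCell> Cells { get; set; } = new List<ResumeTableCell>();
}

public class ResumeTable : IResumeElement
{
    public List<ResumeTableRow> Rows { get; set; } = new List<ResumeTableRow>();
}

public class ResumeBulletedList : IResumeElement
{
    public List<ResumeParagraph> Paragraphs { get; set; } = new List<ResumeParagraph>();
}
```

With these shapes, the object-initializer style used in the section samples (new ResumeParagraph("...") and Paragraphs = new List<ResumeParagraph>()) compiles unchanged.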

The higher-level structures in the document are sections, represented by IResumeSection. A section of the document consists of elements:

public interface IResumeSection
{
    IEnumerable<IResumeElement> ProduceElements(ResumeData resumeData);
}

A section is a logical grouping of elements. I have a personal details section that presents name, contact details and address. Another example is the “Education” section, which describes all the education of the résumé owner. Here is a basic section implementation:

public IEnumerable<IResumeElement> ProduceElements(ResumeData resumeData)
{
    yield return new ResumeParagraph("Sample text for demonstration");
    yield return new ResumeBulletedList()
                 {
                     Paragraphs = new List<ResumeParagraph>()
                                  {
                                      new ResumeParagraph("Bullet Point One"),
                                      new ResumeParagraph("Bullet Point Two"),
                                  }
                 };
}

There is a lot of Composite pattern going on: a résumé is composed of sections, and each section is composed of elements.

Talking about composite, I do have a CompositeSection that consists of other sections:

public class CompositeSection : IResumeSection
{
    public List<IResumeSection> Sections { get; set; }

    public IEnumerable<IResumeElement> ProduceElements(ResumeData resumeData)
    {
        var result = new List<IResumeElement>();

        foreach (var section in Sections)
        {
            var sectionElements = section.ProduceElements(resumeData);
            result.AddRange(sectionElements);
        }

        return result;
    }
}

This section is used together with ConditionalSection, which takes a specification class (look up the Specification pattern):

public class ConditionalSection : IResumeSection
{
    public IResumeSection TruthSection { get; private set; }
    public IResumeSectionSpecification SectionSpecification { get; private set; }


    public ConditionalSection(IResumeSectionSpecification sectionSpecification, IResumeSection truthSection)
    {
        this.TruthSection = truthSection;
        this.SectionSpecification = sectionSpecification;
    }


    public IEnumerable<IResumeElement> ProduceElements(ResumeData resumeData)
    {
        if (SectionSpecification.IsSatisfiedBy(resumeData))
        {
            return TruthSection.ProduceElements(resumeData);
        }
        return Enumerable.Empty<IResumeElement>();
    }
}

The specification interface looks like this:

public interface IResumeSectionSpecification
{
    bool IsSatisfiedBy(ResumeData data);
}
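For illustration, a concrete specification could be as small as the sketch below. ResumeData is not shown in the article, so the class here is a hypothetical stand-in with a single assumed Memberships property, and HasMembershipsSpecification is an invented name rather than one of the production specifications.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the article's ResumeData; the real class is not shown.
public class ResumeData
{
    public List<string> Memberships { get; set; } = new List<string>();
}

// Interface from the article, repeated so the sketch compiles on its own.
public interface IResumeSectionSpecification
{
    bool IsSatisfiedBy(ResumeData data);
}

// Hypothetical specification: satisfied only when there is at least one membership to print.
public class HasMembershipsSpecification : IResumeSectionSpecification
{
    public bool IsSatisfiedBy(ResumeData data)
    {
        return data != null
            && data.Memberships != null
            && data.Memberships.Any();
    }
}
```

Wrapped in a ConditionalSection, a specification like this would keep an empty "Memberships" heading out of the rendered document without the section itself having to know about the rule.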

Rendering

Because I only have 3 different types of elements, my renderers only need to know how to handle these 3. It is down to each renderer to correctly translate the domain structure into the required format. My renderers take a collection of sections, extract the elements from each section and render each element separately. The RendererBase class looks like this:

public abstract class RendererBase
{
    public abstract MemoryStream CreateDocument(IEnumerable<IResumeSection> resumeSections, ResumeData data);
    protected abstract void RenderParagraph(ResumeParagraph resumeParagraph);
    protected abstract void RenderTable(ResumeTable resumeTable);
    protected abstract void RenderBulletedList(ResumeBulletedList bulletedList);


    protected void RenderElements(IEnumerable<IResumeSection> resumeSections, ResumeData data)
    {
        var elementRenderers = GetElementRenderers();

        foreach (var section in resumeSections)
        {
            var elements = section.ProduceElements(data);

            // do the rendering
            foreach (var element in elements)
            {
                var elementRenderer = elementRenderers[element.GetType()];
                elementRenderer.Invoke(element);
            }
        }
    }


    private Dictionary<Type, Action<IResumeElement>> GetElementRenderers()
    {
        var result = new Dictionary<Type, Action<IResumeElement>>()
         {
             { typeof(ResumeParagraph), (element) => RenderParagraph((ResumeParagraph)element) },
             { typeof(ResumeTable), (element) => RenderTable((ResumeTable)element) },
             { typeof(ResumeBulletedList), (element) => RenderBulletedList((ResumeBulletedList)element) }
         };
        return result;
    }
}

And to be honest, only one method is really required here: CreateDocument(). The other abstract methods in the base class are there only to make sure that every renderer knows about all the types of elements. RenderParagraph, RenderTable and RenderBulletedList should really be private and never visible to the outside world (here they are protected, since abstract members cannot be private in C#). This is just an easy way to force every renderer to be able to print every possible element. I don’t know another easy way to do that. If you do, please tell me, I’d like to know!

As Kristian Hellang rightly suggested, the Template Method pattern is very applicable to the renderer base class. This way I make sure that all the renderers know how to deal with all the element types, and I also take some logic out of the concrete renderers, such as deciding which method renders which element.

I’ll spare you the implementation details of the rendering; you can check them out in the sample solution on GitHub (see the link at the end of the article).
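To show how little a concrete renderer has to do once the base class owns the dispatch, here is a sketch of a plain-text renderer. It is not one of the renderers from the sample solution: PlainTextRenderer and SampleSection are hypothetical, the element classes are cut down to the bare minimum (table rows are plain strings here), and RendererBase is a condensed copy of the class above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public interface IResumeElement { }
public class ResumeData { } // hypothetical placeholder for the real data class
public interface IResumeSection { IEnumerable<IResumeElement> ProduceElements(ResumeData resumeData); }

// Simplified element classes, just enough for the sketch.
public class ResumeParagraph : IResumeElement
{
    public ResumeParagraph(string text) { Text = text; }
    public string Text { get; private set; }
}
public class ResumeBulletedList : IResumeElement
{
    public List<ResumeParagraph> Paragraphs { get; set; } = new List<ResumeParagraph>();
}
public class ResumeTable : IResumeElement
{
    public List<List<string>> Rows { get; set; } = new List<List<string>>();
}

// Condensed copy of the article's RendererBase.
public abstract class RendererBase
{
    public abstract MemoryStream CreateDocument(IEnumerable<IResumeSection> resumeSections, ResumeData data);
    protected abstract void RenderParagraph(ResumeParagraph resumeParagraph);
    protected abstract void RenderTable(ResumeTable resumeTable);
    protected abstract void RenderBulletedList(ResumeBulletedList bulletedList);

    protected void RenderElements(IEnumerable<IResumeSection> resumeSections, ResumeData data)
    {
        var renderers = new Dictionary<Type, Action<IResumeElement>>
        {
            { typeof(ResumeParagraph), e => RenderParagraph((ResumeParagraph)e) },
            { typeof(ResumeTable), e => RenderTable((ResumeTable)e) },
            { typeof(ResumeBulletedList), e => RenderBulletedList((ResumeBulletedList)e) }
        };
        foreach (var section in resumeSections)
            foreach (var element in section.ProduceElements(data))
                renderers[element.GetType()].Invoke(element);
    }
}

// Hypothetical renderer that writes UTF-8 plain text instead of PDF or Word.
public class PlainTextRenderer : RendererBase
{
    private readonly StringBuilder output = new StringBuilder();

    public override MemoryStream CreateDocument(IEnumerable<IResumeSection> resumeSections, ResumeData data)
    {
        output.Clear();
        RenderElements(resumeSections, data); // the base class decides which method handles each element
        return new MemoryStream(Encoding.UTF8.GetBytes(output.ToString()));
    }

    protected override void RenderParagraph(ResumeParagraph resumeParagraph)
    {
        output.Append(resumeParagraph.Text).Append('\n');
    }

    protected override void RenderTable(ResumeTable resumeTable)
    {
        foreach (var row in resumeTable.Rows)
            output.Append(string.Join(" | ", row)).Append('\n');
    }

    protected override void RenderBulletedList(ResumeBulletedList bulletedList)
    {
        foreach (var paragraph in bulletedList.Paragraphs)
            output.Append("* ").Append(paragraph.Text).Append('\n');
    }
}

// Hypothetical section used only to exercise the renderer.
public class SampleSection : IResumeSection
{
    public IEnumerable<IResumeElement> ProduceElements(ResumeData resumeData)
    {
        yield return new ResumeParagraph("Sample text");
        yield return new ResumeBulletedList
        {
            Paragraphs = { new ResumeParagraph("One"), new ResumeParagraph("Two") }
        };
    }
}
```

A throwaway renderer along these lines can also be handy for eyeballing what a list of sections produces before involving the PDF or Word engines.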

Testability

Every element is very small; every bigger element consists of small elements but is itself also small. The whole solution is very testable. I honestly have 100% test coverage on my production classes. Even the renderers are covered: there are ways to validate that the PDF and Word documents are generated correctly. You can find some details about testing OpenXML generation in my previous articles. I’m not including tests in the sample solution because I’m lazy. I’ve already spent 2 days putting the sample together, most of that time killing domain-specific stuff and removing a lot of code I don’t want to share. So you’ll have to trust me on this one -)

If you are really interested in testing the renderers, let me know and I’ll try writing about it. Testing the domain models is very simple and not exciting in any way. You can see how Mark does his testing in his sample solution. I trust him to do a good job in TDD!
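As a rough idea of how simple a domain-model test can be, here is a framework-free sketch that checks ConditionalSection. The test doubles (FixedSpecification, MarkerElement, SingleElementSection) and the ConditionalSectionChecks class are hypothetical, ResumeData is an empty placeholder, and a real test suite would use a proper test framework instead of throwing exceptions by hand.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public interface IResumeElement { }
public class ResumeData { } // hypothetical placeholder
public interface IResumeSection { IEnumerable<IResumeElement> ProduceElements(ResumeData resumeData); }
public interface IResumeSectionSpecification { bool IsSatisfiedBy(ResumeData data); }

// Condensed copy of the article's ConditionalSection.
public class ConditionalSection : IResumeSection
{
    public IResumeSection TruthSection { get; private set; }
    public IResumeSectionSpecification SectionSpecification { get; private set; }

    public ConditionalSection(IResumeSectionSpecification sectionSpecification, IResumeSection truthSection)
    {
        TruthSection = truthSection;
        SectionSpecification = sectionSpecification;
    }

    public IEnumerable<IResumeElement> ProduceElements(ResumeData resumeData)
    {
        return SectionSpecification.IsSatisfiedBy(resumeData)
            ? TruthSection.ProduceElements(resumeData)
            : Enumerable.Empty<IResumeElement>();
    }
}

// Hypothetical test doubles.
public class FixedSpecification : IResumeSectionSpecification
{
    private readonly bool answer;
    public FixedSpecification(bool answer) { this.answer = answer; }
    public bool IsSatisfiedBy(ResumeData data) { return answer; }
}

public class MarkerElement : IResumeElement { }

public class SingleElementSection : IResumeSection
{
    public IEnumerable<IResumeElement> ProduceElements(ResumeData resumeData)
    {
        yield return new MarkerElement();
    }
}

public static class ConditionalSectionChecks
{
    // Throws if ConditionalSection does not honour its specification.
    public static void Run()
    {
        var data = new ResumeData();
        var shown = new ConditionalSection(new FixedSpecification(true), new SingleElementSection());
        var hidden = new ConditionalSection(new FixedSpecification(false), new SingleElementSection());

        if (shown.ProduceElements(data).Count() != 1)
            throw new Exception("A satisfied specification should pass the inner elements through.");
        if (hidden.ProduceElements(data).Any())
            throw new Exception("An unsatisfied specification should produce no elements.");
    }
}
```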

Homoiconicity

You might ask “where the hell is homoiconicity here?”. It is here:

public static List<IResumeSection> ComposeResume()
{
    return new List<IResumeSection>()
           {
                new PersonalDetailsSection(),
                new EducationSection(),
                new CertificationSection(),
                new EmploymentHistorySection(),

                new ConditionalSection(
                    new CitizenSecretSpecification(), 
                    new MembershipSection()),

                new ConditionalSection(
                    new TopSecretSpecification(), 
                    new CompositeSection()
                    {
                        Sections = new List<IResumeSection>()
                                   {
                                       new TopSecretSection(),
                                       new CitizenSecretSection(),
                                   }
                    }),
           };
}

A résumé is composed of little classes: sections. These classes can be serialized and persisted. Or a list of sections can be read from a database and composed into C# objects, which you then feed into renderers. Or a user can drag and drop sections into place; you then build a C# list of IResumeSection and feed that to a renderer, which gives each user a customised résumé. More than that, you can persist the order of sections in the drag’n’drop area and give the user the same customised résumé next time. Or do many other cool things…
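One way the "persist the order of sections" idea could be approached is sketched below: store an ordered list of short keys and map each key back to a section through a registry. This is an assumption about how it might be done, not what the sample solution does. SectionRegistry and the key strings are hypothetical, and the section classes are empty placeholders that share names with the ones in ComposeResume().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public interface IResumeElement { }
public class ResumeData { } // hypothetical placeholder
public interface IResumeSection { IEnumerable<IResumeElement> ProduceElements(ResumeData resumeData); }

// Empty placeholder sections; the real ones produce paragraphs, tables and lists.
public class PersonalDetailsSection : IResumeSection
{
    public IEnumerable<IResumeElement> ProduceElements(ResumeData resumeData) { return Enumerable.Empty<IResumeElement>(); }
}
public class EducationSection : IResumeSection
{
    public IEnumerable<IResumeElement> ProduceElements(ResumeData resumeData) { return Enumerable.Empty<IResumeElement>(); }
}
public class EmploymentHistorySection : IResumeSection
{
    public IEnumerable<IResumeElement> ProduceElements(ResumeData resumeData) { return Enumerable.Empty<IResumeElement>(); }
}

// Hypothetical registry: turns persisted keys back into section objects.
public static class SectionRegistry
{
    private static readonly Dictionary<string, Func<IResumeSection>> Factories =
        new Dictionary<string, Func<IResumeSection>>
        {
            { "PersonalDetails", () => new PersonalDetailsSection() },
            { "Education", () => new EducationSection() },
            { "EmploymentHistory", () => new EmploymentHistorySection() }
        };

    // Builds the section list in exactly the order the keys were saved.
    public static List<IResumeSection> Compose(IEnumerable<string> orderedKeys)
    {
        var sections = new List<IResumeSection>();
        foreach (var key in orderedKeys)
        {
            Func<IResumeSection> factory;
            if (!Factories.TryGetValue(key, out factory))
                throw new ArgumentException("Unknown section key: " + key);
            sections.Add(factory());
        }
        return sections;
    }
}
```

The keys could come from a database row or from the saved state of a drag’n’drop editor; the resulting list goes to the renderers exactly like a hand-written one.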

Just for kicks, in my sample solution I compose a list of sections, serialize it into the standard .NET binary format and deserialize it; then I serialize it as JSON with Json.NET, deserialize it again and finally render these sections into PDF and Word:

public static void Main(string[] args)
{
    //var resumeSections = ResumeComposer.ComposeBasicResume();
    var resumeSections = ResumeComposer.ComposeResumeForTopSecretAgents();

    ResumeSectionsToBinaryFormat(resumeSections, "ResumeSections.cv");

    List<IResumeSection> sectionsFromBinary = ReadResumeSectionsFromBinary("ResumeSections.cv");


    String json = SerializeAsJson(sectionsFromBinary);

    List<IResumeSection> sectionsFromJson = DeserializeJson(json);


    var resumeData = Data.JamesBond;
    CreatePdf(sectionsFromJson, resumeData);

    CreateWord(sectionsFromJson, resumeData);
}

I hope all this makes sense, and that the next time I come across PDF generation code, it will be nice and testable -)

The sample solution is on GitHub. Have a look through.

p.s. In another project of mine I’ll need to create an interface for customizing the résumé format for different users. Drag’n’drop with JSON persistence is what I’ll be using. Maybe I’ll write about it here…