I’m building a prototype for a new project and we decided to use DocumentDB to store our data. There will be very little data and even fewer relationships within it, so a document database is a good fit. There is also a chance we will use DocumentDB in production.

There is comprehensive documentation about the structure and how it all ties together, yet not enough code samples on how to use attachments. I struggled a bit to come up with a working solution, so I’ll explain it all here for future generations.

Structure

This diagram is from the documentation

And this is correct, but incomplete. Hold that thought for a moment; I’ll come back to it later.

Ignore the three nodes on the left of the diagram and look at the Documents and Attachments nodes. This basically shows that if you create a document, it will be available at a URI like this:

https://{accountname}.documents.azure.com/dbs/{databaseId}/colls/{collectionId}/docs/{docId}


That’s fine: you send an authenticated request to the correctly formed URI and you get JSON back as a result.

According to the schema you will also get the attachment at this address:

https://{accountname}.documents.azure.com/dbs/{databaseId}/colls/{collectionId}/docs/{docId}/attachments/{attachId}


And this is correct. If you do an HTTP GET to this address, you’ll get JSON, something like this:

{
"contentType": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
"id": "1",
"media": "/media/5VEpAMZpeasdfdfdAAAAAOFDl80B",
"_rid": "5VEpAMZpeasdfdfdAAAAAOFDl80B=",
"_self": "dbs\/5VEpAA==\/colls\/5VEpEWZpsQA=\/docs\/5VEpAMZpeasdfdfdAAAAAOFDl80B==\/attachments\/5VEpAMZpeasdfdfdAAAAAOFDl80B=",
"_etag": "\"0000533e-0000-0000-0000-59079b8a0000\"",
"_ts": 1493673393
}


Turns out there are 2 ways you can do attachments in DocumentDB: managed and (surprise!) unmanaged. Unmanaged is when you don’t really attach anything, but just provide a link to external storage. To be honest, I don’t see much sense in doing it that way. Why bother with an extra resource just to keep external links? It would be much easier to make these links part of the actual document, so you don’t have to make another call to retrieve them.

Managed attachments are when you actually store binaries in DocumentDB, and this is what I chose to use. Unfortunately I had to discover for myself that it is not straightforward.

Managed Attachments

If you look at the JSON above, there is a line "media": "/media/5VEpAMZpeasdfdfdAAAAAOFDl80B". This is actually the link to the stored binary payload, and you need to query that URI to get the payload. So, knowing the document id, you’ll need 2 requests to get your hands on the attached binaries:

1. Get the list of attachments.
2. Every attachment contains a link to Media: get that.

So this /media/{mediaId} is missing from the diagram above. Perhaps this is deliberate, to avoid confusing users. I’ll go with that.

Code Samples

Now to the code samples.

I’m using the NuGet package provided by Microsoft to do the requests for me:

Install-Package Microsoft.Azure.DocumentDB


private async Task<DocumentClient> GetClientAsync()
{
if (documentClient == null)
{
var endpointUrl = configuration["DocumentDb:EndpointUri"];
var primaryKey = configuration["DocumentDb:PrimaryKey"];

documentClient = new DocumentClient(new Uri(endpointUrl), primaryKey);
await documentClient.OpenAsync();
}

return documentClient;
}


where documentClient is a private field of the containing class.

Now let’s create a document and attach a binary:

var myDoc = new { id = "42", Name = "Max", City="Aberdeen" }; // this is the document you are trying to save
var attachmentStream = File.OpenRead("c:/Path/To/File.pdf"); // this is the document stream you are attaching

var client = await GetClientAsync();
var createUrl = UriFactory.CreateDocumentCollectionUri(DatabaseName, CollectionName);
Document document = await client.CreateDocumentAsync(createUrl, myDoc);

await client.CreateAttachmentAsync(document.SelfLink, attachmentStream, new MediaOptions()
{
ContentType = "application/pdf", // your application type
Slug = "78", // this is actually attachment ID
});


A few things are going on here. I create an anonymous class for the sake of the sample; use strongly typed models in real code. Reading the attachment stream from the file system is also for the sake of the sample; whatever source you have, you’ll need to provide an instance of Stream to upload an attachment.

Now this is worth paying attention to: var createUrl = UriFactory.CreateDocumentCollectionUri(DatabaseName, CollectionName);. The UriFactory class is not really a factory in the broad OOP sense: it does not produce other objects that will do the actual work. This class gives you a lot of patterns that create URI addresses based on the names of the things you use. In other words, there are a lot of String.Format calls with templates.

The UriFactory.CreateDocumentCollectionUri method is going to give you a link in the format dbs/{dbId}/colls/{collectionId}. If you look at CreateAttachmentUri, it works with this template: dbs/{dbId}/colls/{collectionId}/docs/{docId}/attachments/{attachmentId}.
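To make the "String.Format with templates" point concrete, here is a minimal sketch of how such templates compose. The UriTemplateSketch class and its method names are hypothetical stand-ins written for this post; this is not the SDK's UriFactory source, and the escaping shown is my assumption about what a sensible implementation would do.

```csharp
using System;

// Hypothetical stand-in for illustration only; not the SDK's UriFactory.
public static class UriTemplateSketch
{
    // dbs/{dbId}/colls/{collectionId}
    public static string Collection(string dbId, string collectionId) =>
        $"dbs/{Uri.EscapeDataString(dbId)}/colls/{Uri.EscapeDataString(collectionId)}";

    // dbs/{dbId}/colls/{collectionId}/docs/{docId}
    public static string Document(string dbId, string collectionId, string docId) =>
        $"{Collection(dbId, collectionId)}/docs/{Uri.EscapeDataString(docId)}";

    // dbs/{dbId}/colls/{collectionId}/docs/{docId}/attachments/{attachmentId}
    public static string Attachment(string dbId, string collectionId, string docId, string attachmentId) =>
        $"{Document(dbId, collectionId, docId)}/attachments/{Uri.EscapeDataString(attachmentId)}";
}

public static class Program
{
    public static void Main()
    {
        // prints: dbs/mydb/colls/mycoll/docs/42/attachments/78
        Console.WriteLine(UriTemplateSketch.Attachment("mydb", "mycoll", "42", "78"));
    }
}
```

Each level simply appends one more segment to the level above it, which is why the attachment address is the document address plus /attachments/{attachmentId}.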

The next line, await client.CreateDocumentAsync(createUrl, myDoc), does what you think it does: it creates a document on Azure. No surprises here.

But when you look at the block of code with client.CreateAttachmentAsync(), not everything might be clear. document.SelfLink is a URI that links back to the document; it will be in the format dbs/{dbId}/colls/{collectionId}/docs/{docId}. The next big question is Slug, which actually works as the attachment ID. They might as well have called it Id, because this is what goes into the id field when you look at the storage.

Retrieving Attachments

Once we’ve put something in the storage, some time in the future we’ll have to take it out. Let’s get back our attached file.

var client = await GetClientAsync();
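var attachmentUri = UriFactory.CreateAttachmentUri(DatabaseName, CollectionName, docId, attachId);
var attachmentResponse = await client.ReadAttachmentAsync(attachmentUri);
var resourceMediaLink = attachmentResponse.Resource.MediaLink;
var resourceMedia = await client.ReadMediaAsync(resourceMediaLink);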



Here we have some funky things going on again. UriFactory.CreateAttachmentUri(DatabaseName, CollectionName, docId, attachId) will give dbs/{dbId}/colls/{collectionId}/docs/{docId}/attachments/{attachmentId}, and a GET to this address returns the same JSON as at the start of the article. The value of attachmentResponse.Resource.MediaLink will look like /media/5VEpAMZpeasdfdfdAAAAAOFDl80B3, and this is the path to GET the actual attached binary, which is what await client.ReadMediaAsync(resourceMediaLink) does. The rest should be self-explanatory.
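Once you hold the payload as a Stream, the last step is plain .NET. The sketch below saves any readable stream to a file; MediaSaver and SaveToFileAsync are hypothetical names of mine, and the MemoryStream is only a stand-in so the sample runs without an Azure account. In the flow above you would pass the stream exposed by the media response instead (as far as I can tell that is its Media property, but treat that as an assumption and check your SDK version).

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

public static class MediaSaver
{
    // Copies any readable stream to a file and returns the number of bytes written.
    public static async Task<long> SaveToFileAsync(Stream media, string path)
    {
        using (var file = File.Create(path))
        {
            await media.CopyToAsync(file);
            return file.Length;
        }
    }
}

public static class Program
{
    public static async Task Main()
    {
        // Stand-in for the attachment payload; replace with the stream from the media response.
        using (var media = new MemoryStream(Encoding.UTF8.GetBytes("hello attachment")))
        {
            var path = Path.Combine(Path.GetTempPath(), "attachment-sketch.bin");
            var written = await MediaSaver.SaveToFileAsync(media, path);
            Console.WriteLine(written); // prints 16
        }
    }
}
```

Streaming with CopyToAsync avoids loading the whole attachment into memory, which matters once the attached files get large.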

Conclusion

To be honest, the lack of explanation of this /media/{mediaId} in the documentation does not earn the team any kudos. And I feel like the provided API is not straightforward and not easy to use: I had to decompile the API library and have a wander through it to see what is actually happening inside. Also, there is too much leakage of the implementation: I really could’ve lived without ever having to know about UriFactory.

Git can do a lot of things, but I’m too lazy to remember all the commands I need, and some of them are like 5 words long. So I’ll put them here so that next time I don’t have to search for them.

Update list of remote branches:

git remote update origin --prune


Delete local and remote branch:

git push origin --delete <branch_name>
git branch -d <branch_name>


Push all branches to remote:

git push --all -u


Push a new local branch to remote:

git push origin branchB:branchB


Create an annotated tag:

git tag -a v1.4 -m "my version 1.4"


And push tags to remote:

git push --follow-tags


I’ve been using CQRS architecture for about 4 years now on projects of different sizes and life expectancies, and today, after another refactoring session, I would like to share ideas that seemed good when first done but turned out to be a maintenance nightmare.

I’m on a roll today – second post in the day!

Some time last year I blogged about pushing a NuGet package to a VSTS package feed from a Cake script. Turns out there is an easier way to do it that does not involve using your own personal access token and storing it in a nuget.config file.

Turns out that VSTS exposes an OAuth token to build scripts. You just need to make it available to the scripts:

In your build definition go to Options and tick the checkbox “Allow Scripts to Access OAuth Token”:

Then, instead of creating a nuget.config in your repository, you need to create a new NuGet source on the build agent machine that has this token as a password. After that you can push packages to that feed just by using the name of the new feed. Luckily Cake already has all the commands you need to do that:

Task("Publish-Nuget")
.IsDependentOn("Package")
.WithCriteria(() => Context.TFBuild().IsRunningOnVSTS)
.Does(() =>
{
var package = "./path/to/package.nupkg";

var accessToken = EnvironmentVariable("SYSTEM_ACCESSTOKEN");

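// add the NuGet source into the build agent sources list
NuGetAddSource("MyFeedName", "<your feed URL, ending in v2>", new NuGetSourcesSettings
{
UserName = "VSTS", // placeholder user name; the token is what authenticates
Password = accessToken,
});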

// Push the package.
NuGetPush(package, new NuGetPushSettings
{
Source ="MyFeedName",
ApiKey = "VSTS",
Verbosity = NuGetVerbosity.Detailed,
});
});


I like this a lot better than having to faff about with a personal access token and a nuget.config file. You can probably restore NuGet packages from private sources the same way, but I have not tried it yet.

Nuget Version

You may have noticed that I specify the URL for the feed in the version 2 format, i.e. ending in v2. This is because the default nuget.exe version provided by VSTS does not yet support v3, even though the package feeds themselves can take v3. Right now, if you try to push to a URL with “v3” in it, you will get this error:

System.InvalidOperationException: Failed to process request. 'Method Not Allowed'.
The remote server returned an error: (405) Method Not Allowed.. ---> System.Net.WebException: The remote server returned an error: (405) Method Not Allowed.


So downgrade the URL to v2, as I’ve done in the example above. Most of the time v2 works just fine for pushing packages. But if you really need v3, you can check in your own copy of nuget.exe and then specify where to find this file like this:

NuGetPush(package, new NuGetPushSettings
{
Source ="AMVSoftware",
ApiKey = "VSTS",
Verbosity = NuGetVerbosity.Detailed,
ToolPath = "./lib/nuget.exe",
});


I keep migrating my build scripts to the Cake build system, and I keep running them on VSTS, because the VSTS build system is mostly awesome: it is free for small teams and has a lot of good stuff in it.

But working with NuGet on VSTS is, for some reason, a complete PITA. This time I had trouble restoring NuGet packages:

'AutoMapper' already has a dependency defined for 'NETStandard.Library'.
An error occurred when executing task 'Restore-NuGet-Packages'.
Error: NuGet: Process returned an error (exit code 1).
System.Exception: Unexpected exit code 1 returned from tool Cake.exe


This is because AutoMapper is way ahead of the times and VSTS uses an older version of nuget.exe. If I run the same code locally, I don’t get this error. So I need to provide my own nuget.exe file and rely on that. This is how it is done in a Cake script:

Task("Restore-NuGet-Packages")
.Does(() =>
{
var settings = new NuGetRestoreSettings()
{
// VSTS has old version of Nuget.exe and Automapper restore fails because of that
ToolPath = "./lib/nuget.exe",
Verbosity = NuGetVerbosity.Detailed,
};
NuGetRestore("./MySolution.sln", settings);
});


Note the overridden ToolPath: this is how you tell Cake to use a specific .exe file for the operation.

Ladies and gentlemen, I’m glad to present NSaga, a lightweight saga management framework for .Net. This is something I’ve been working on for the last few months, and I can now happily announce the first public release. NSaga gives you the ability to create and manage sagas without having to write any plumbing code yourself.

Saga is a multi-step operation or activity that has persisted state and is operated by messages. Saga defines behaviour and state, but keeps them distinctly separated.

Saga classes are defined by the ISaga<TSagaData> interface and take messages. Messages are directed by a SagaMediator. NSaga comes with an internal DI container, but you can use your own. It comes with SQL Server persistence, and others will follow shortly.
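To illustrate the idea independently of the library, here is a stripped-down sketch of "state kept between messages, behaviour kept separate from state". BasketState and BasketSagaSketch are hypothetical types written for this post and are not NSaga APIs; the dictionary merely stands in for a persistent store keyed by correlation id.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical types for illustration; none of these are NSaga APIs.
public class BasketState
{
    public Guid CustomerId { get; set; }
    public List<string> Products { get; } = new List<string>();
}

public class BasketSagaSketch
{
    // Stands in for persistence: state survives between messages, keyed by correlation id.
    private readonly Dictionary<Guid, BasketState> store = new Dictionary<Guid, BasketState>();

    // First message: creates the state for a new multi-step operation.
    public void Start(Guid correlationId, Guid customerId)
    {
        store[correlationId] = new BasketState { CustomerId = customerId };
    }

    // Follow-up message: only succeeds if the operation was started earlier.
    public bool AddProduct(Guid correlationId, string productName)
    {
        if (!store.TryGetValue(correlationId, out var state)) return false;
        state.Products.Add(productName);
        return true;
    }

    public int ProductCount(Guid correlationId) =>
        store.TryGetValue(correlationId, out var state) ? state.Products.Count : 0;
}

public static class Program
{
    public static void Main()
    {
        var saga = new BasketSagaSketch();
        var id = Guid.NewGuid();
        saga.Start(id, Guid.NewGuid());
        saga.AddProduct(id, "Magic Dust");
        Console.WriteLine(saga.ProductCount(id));                     // prints 1
        Console.WriteLine(saga.AddProduct(Guid.NewGuid(), "Orphan")); // prints False
    }
}
```

The correlation id is what ties separate messages to the same long-running operation; a real framework replaces the dictionary with durable storage so the state survives restarts.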

Basic saga will look like this:

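public class ShoppingBasketSaga : ISaga<ShoppingBasketData>,
InitiatedBy<StartShopping>,
ConsumerOf<AddProductIntoBasket>,
ConsumerOf<NotifyCustomerAboutBasket>
{
public Guid CorrelationId { get; set; }
public Dictionary<string, string> Headers { get; set; }
public ShoppingBasketData SagaData { get; set; }

private readonly IEmailService emailService;
private readonly ICustomerRepository customerRepository;

public ShoppingBasketSaga(IEmailService emailService, ICustomerRepository customerRepository)
{
this.emailService = emailService;
this.customerRepository = customerRepository;
}

public OperationResult Initiate(StartShopping message)
{
SagaData.CustomerId = message.CustomerId;
return new OperationResult(); // no errors to report
}

public OperationResult Consume(AddProductIntoBasket message)
{
SagaData.BasketProducts.Add(new BasketProducts()
{
ProductId = message.ProductId,
ProductName = message.ProductName,
ItemCount = message.ItemCount,
ItemPrice = message.ItemPrice,
});
return new OperationResult(); // no possibility to fail
}

public OperationResult Consume(NotifyCustomerAboutBasket message)
{
var customer = customerRepository.Find(SagaData.CustomerId);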
if (String.IsNullOrEmpty(customer.Email))
{
return new OperationResult("No email recorded for the customer - unable to send message");
}

try
{
var emailMessage = $"We see your basket is not checked-out. We offer you a 85% discount if you go ahead with the checkout. Please visit https://www.example.com/ShoppingBasket/{CorrelationId}";
emailService.SendEmail(customer.Email, "Checkout not complete", emailMessage);
}
catch (Exception exception)
{
return new OperationResult($"Failed to send email: {exception}");
}
return new OperationResult(); // operation successful
}
}


And the saga usage will be

var correlationId = Guid.NewGuid();

// start the shopping.
mediator.Consume(new StartShopping()
{
CorrelationId = correlationId,
CustomerId = Guid.NewGuid(),
});

// add a product into the shopping basket
mediator.Consume(new AddProductIntoBasket()
{
CorrelationId = correlationId,
ProductId = 1,
ProductName = "Magic Dust",
ItemCount = 42,
ItemPrice = 42.42M,
});


There is some documentation, and it is all hosted on GitHub.

Have a look through the samples, add a star to the repository, and next time you need a multi-step operation, give it a go!

I’ve been using TFS for most of my professional career. I mean if it is a paid project – it was on TFS. Surely there was a bit of SVN as well, but not too much. In university I did try CVS. For projects where I picked version control, I tried mercurial and git. But anyway, most of the work I’ve done on .Net was with TFS. And I mean TFS was the source control system – never used it for ticketing and task managing, so can’t comment on that. From now on if I say TFS I mean version control system.

Recently we had time to move from TFS to git. I mean we moved that big-ass mahusive project with over 200K lines of code that had been in active development for over 5 years. It was hard work to move it and preserve the history. And to be honest, my team-mate did most of the migration himself; I stood by and made sure I didn’t interfere.

And I tell you one thing – moving to git was a good decision. Well worth the time/effort investment. So now I’ll try to explain why I’m so happy with git.

TFS is Horrible

If you don’t believe that TFS as a version control is horrible – Google for it. There will be a lot of articles explaining why you should never touch TFS. Probably most of them are true. I’ll list my reasons why I hate TFS:

• Need to be connected all the time. Can’t checkout/modify files if your local connection is down. Or TFS server is down.
• Locks files for read-only (no longer the case in latest TFS, but was in v2010). Makes it almost impossible to change files outside of Visual Studio.
• Projects need to be mapped to folders on your drive and this is very flaky. Many hours were wasted on this quirk.
• Really flaky when trying to work with previous check-ins. One of the things I ran into was moving files. You can’t do anything with a file in an old check-in if in later check-ins it was moved/renamed/deleted. I mean, you are a time machine, TFS; I want to meddle with history, and I don’t care that this file does not exist 20 check-ins later. Nope.
• SLOW. All the file operations need to go over the network.
• Workspaces. WTF?
• Branching is so bad, it really can be written off the table. More on that later.

Git is Horrible

Same as above: if you Google for this, you will get a number of articles explaining why git is horrible. Most of them are true. Probably. I’ve not run into most of those problems myself, so here are my own gripes with git:

• Different line endings. It is 2016 for fire-sake! Just bloody ignore them by default. All of you: git, merge tools, editors, etc. I don’t care, I don’t want to know that somebody committed a file with different line endings. Just work!
• Complexity of some commands. Remember your first try of interactive rebase?
• Hard to learn. I’m learning a lot of new things every time I get stuck with git.
• Hard to remember commands – I google a lot for stuff I need to do. Can never remember all the options.
• Git is HARD. But there is always git reset --hard HEAD.

So?

I’ve re-read the description above and seems like we’ve replaced QUIRKY with HARD.

I can’t work with quirky – so much wasted time and effort just to overcome the quirkiness.

I can deal with hard – I’ve completed BSc and MSc courses in Computer Science after all – that was HARD. But git has soooooo many books/tutorials/articles now that I always found an answer to my problem within the first page of Google search results.

Branching

Now the biggest benefit (apart from escaping from a quirky system) of moving from TFS to git is branching. In TFS branching is so bad it is a write-off. It takes forever to create a branch, the branch must literally live in another folder, and you need to restart VS to switch to another branch. If you have IIS pointing to a web-site inside of your solution, you’ll have to reconfigure IIS to switch to another branch. Basically branching is TOO EXPENSIVE in TFS.

And it is not an issue until you have used cheap branching in real life.

If you never used git-branded branching, you are used to not doing branching. So if you have any risky development, you tend to be very careful with it, so it does not spoil the rest of the code. You hide it behind a toggle, you don’t really commit it, you comment out your breaking code before checking in. And use a lot of other dirty tricks to not break your existing stuff. That does not always work well and you tend to be shy with potentially breaking changes. This leads to development stagnation because some changes are so global it is impossible to hide them. And you just don’t do them. That leads either to code quality problems or your product suffers from lack of good features.

Also, lack of experimental space kills a lot of innovation. I’ve gone through that recently: “hm… what if I add a generic parameter here.. that will actually propagate into a lot of places in the codebase.. nope, can’t afford to possibly break everything in case this does not work out. Not going to try that”. But if cheap branching is available, you can just make a new branch and break stuff all you like. If it does not work out, just abandon the branch; at least you have tried.

Cheap Branching

Git is all about branches and makes them really cheap to implement/use. In fact, git is pointless without branches – you might as well use SVN instead.

With availability of cheap branches in git, I found that I’m following through with more risky ideas. A lot of them work out and give me good results. If you have a feature that takes a while to implement, you move the development in a branch and don’t disturb your main release – your customers won’t get half-baked code, you won’t need to hide new stuff behind a feature toggle. You merge it back into the mainstream when it is ready – no magic.

This promotes experiments without the risk of them leaking into your production. If experimental code does not work out, it is still there in a branch, just not merged anywhere; you can come back to it later, with no need to delete anything. Experiments lead to better development cycles, which in turn give you a better product.

Conclusion

Even if git is not making you more money through a better product, it definitely saves you time when you don’t have to maintain a million toggles for unfinished features.

Dump TFS, move to the dark side, use git.

I see and use badly written software every day. Examples are all over the place. Yesterday I had to log in to my pension provider’s system. They let you pick your own username, but it HAS to be uppercase and prefixed with 001. Why? Who came up with this crappy restriction?

I’ve seen database structures where information about people was stored in table called _job. And information about jobs was in table _jobs. Just because.

And don’t get me started on home-grown user/password security solutions where you can only have passwords of maximum 8 characters. You see them everywhere.

And plain-text password storage? There is a ton of systems that do that. There is a dedicated shaming site Plain Text Offenders. It’s a poor state of industry when we have such sites.

I can go on forever about shitty software written by clue-less developers or pressed for time/requirements by even more clue-less managers. It is everywhere. But I realise the problem is with us. And we can fix it.

Next time you write any code – think a bit harder. Next time your manager comes up with some requirement that does not make sense – push against it a bit harder. Next time a clue-less developer comes to you (or on Stack Overflow) for advice – try a bit harder to explain the best practices. Next time you start a new tech-stack – read the docs a bit longer. Next time you hit a bug – write a regression unit test. Next time write better software.

I’m a big fan of unit tests. But not everything can and should be unit tested. I also love CQRS architecture: it provides awesome separation of reads and writes and isolates each read from the next read via query classes. Usually my queries read data from a database, though they can use any persistence, e.g. files. About 90% of my queries are run against a database with Entity Framework or with some micro ORM (currently I’m a fan of PetaPoco, though Dapper is also grand).

And no amount of stubs/mocks or isolating frameworks will help you with testing your database-access layer without actual database. To test your database-related code you need to have a database. Full stop, don’t even argue about this. If you say you can do unit-tests for your db-layer, I say your tests are worth nothing.

A while ago I blogged about integration tests, and that article seems to be quite popular: it has been in the top 10 by visitors for the last 2 years. So I decided to write an update.