How we run an OpenStreetMap service in Azure
Architecture, trade-offs and cost — what, why and how much.
As a .NET developer I tend to fall into the trap of approaching every software problem by thinking how I will do it with C#/ASP.NET/MSSQL. As I try to break this habit, the StopByStop project has so far been my most valuable experience in forcing me to think outside the familiar toolbox.
When I first started prototyping a solution to show highway exits and the places of interest around them on the route from A to B for all US destinations, what eventually became StopByStop, I predictably started searching Stack Overflow for "C# maps routing" samples, which only took me so far. I was lucky to work with a really smart developer with OSM and GIS experience who introduced me to PostgreSQL and PostGIS and wrote the initial version of the backend.
That was a great start, but I still had to improve scalability and host the service. After trying and rejecting one of the cheap shared PostGIS hosting providers, I decided that while the best long-term solution would be to manage our own servers, in the short term Azure, with its ability to resize instances, was the best way to go.
Linux VM with PostgreSQL
I experimented with various Linux VM sizes on Azure and eventually settled on DS4 for the VM and 1TB of premium storage for the database.
The CPU ended up being the bottleneck for query execution. Since PostgreSQL cannot use multiple cores to execute a single query, searching for exits and places of interest along a long route (my favorite ultimate test is Seattle to Miami) involved scanning multi-million-row indexes on several tables. My approach was to break long routes into shorter chunks, which resulted in some loss of accuracy but allowed me to launch up to 8 sub-queries simultaneously for every customer query.
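The chunk-and-fan-out idea can be sketched as below. This is a minimal illustration, not the actual StopByStop code: `run_segment_query` is a hypothetical stub standing in for the real PostGIS query, and the chunking heuristic is assumed.

```python
from concurrent.futures import ThreadPoolExecutor

def split_route(points, max_chunks=8):
    """Break an ordered list of route points into at most max_chunks
    contiguous, slightly overlapping segments. Accuracy is lost at
    chunk boundaries, which is the trade-off mentioned above."""
    n = max(1, min(max_chunks, len(points) - 1))
    size = (len(points) + n - 1) // n
    return [points[i:i + size + 1] for i in range(0, len(points) - 1, size)]

def run_segment_query(segment):
    # Placeholder for the real PostGIS query that scans the exit/POI
    # indexes near this segment of the route.
    return {"segment": segment, "exits": []}

def query_route(points):
    chunks = split_route(points)
    # Each chunk runs as its own query, so the database can keep one
    # core busy per sub-query even though a single PostgreSQL query
    # runs on a single core.
    with ThreadPoolExecutor(max_workers=8) as pool:
        return list(pool.map(run_segment_query, chunks))
```

Adjacent chunks share their boundary point so that no stretch of the route falls between two sub-queries.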
The goal was to execute any uncached query in under 12 seconds on the DB side. That may not sound ambitious, but given that the intended scenario for our service is local destinations, we are hoping that in most cases customers will see their timeline appear much faster.
Azure VM resizing comes in handy for OSM data upload
So while DS4 is enough for regular operation, I found out that to upload OSM data into the database without spending several weeks on it, I needed a bigger machine: DS14.
This deserves a separate blog post, but in a nutshell, uploading OSM data to PostgreSQL involves:
- osm2pgsql, which in my case actually doesn't take that long, and for which there are a lot of helpful tips on optimizing performance;
- osm2pgrouting, for which I couldn't find many performance-related suggestions; my question on the GIS Stack Exchange remained unanswered.
osm2pgrouting loads the whole OSM file (~160GB for North America) into memory, so if most of it ends up in swap space, the job will take weeks, even if the swap is on an SSD drive. On DS14, with its 112GB of RAM, I could complete osm2pgrouting in 3 days.
— Alex Bulankou (@bulankou) March 2, 2016
What about storage, any way to save there? At first I considered getting something like D3, which has a 200GB local SSD that would be sufficient for the North America OSM database, so I wouldn't need extra storage. However, the problem with a local SSD is that it is irrevocably lost on any hardware failure or when the VM is resized. So this was not an option.
When choosing the premium storage size I picked P30 (1TB), because anything below that (like P20) would have had a transfer rate under 200MB/s. After trying it out and watching it with the iostat tool, I found that the IO rate becomes the bottleneck at this point and the price difference is not that significant.
I would like to be able to easily restore my database, with the OSM data along with my custom indexes and tables, on a different machine. This is in case I decide to scale out to process more queries simultaneously, and ideally I wouldn't want to wait 3 days for osm2pgrouting to complete. This is what worked best for me: dumping the database with the pg_dump utility and backing the dump up to Google Drive using odeke-em/driveclient.
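A sketch of that backup flow, assuming the dump is written into a folder that driveclient keeps in sync with Google Drive; the database name and the mount path here are hypothetical, and the custom format (`-Fc`) is one reasonable choice since it is compressed and restorable with pg_restore:

```python
import subprocess
from datetime import date

def pg_dump_command(dbname, out_dir):
    """Build a pg_dump invocation in custom format (-Fc), writing a
    date-stamped dump file into out_dir."""
    out_file = f"{out_dir}/{dbname}-{date.today().isoformat()}.dump"
    return ["pg_dump", "-Fc", "-f", out_file, dbname]

def backup(dbname, drive_dir="/home/osm/gdrive/backups"):
    # drive_dir is assumed to be the local folder that
    # odeke-em/driveclient syncs to Google Drive; once the dump
    # lands there, the sync daemon uploads it.
    cmd = pg_dump_command(dbname, drive_dir)
    subprocess.run(cmd, check=True)
```

Restoring on a fresh machine is then `pg_restore -d <dbname> <dumpfile>` instead of a 3-day osm2pgrouting run.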
Frontend — using the right tool for the job?
So given that a Linux server was required for PostgreSQL, the cheapest approach for the frontend would have been to go with PHP or Node.js.
However, I have experience in neither of them, so to speed things up I went down the suboptimal but familiar path: ASP.NET, which allowed me to get things done much faster. The architecture ended up like this:
I'm using ASP.NET on an Azure Web App to host my web server and calling PostgreSQL using the Npgsql provider. Also, because I am addicted to LINQ to SQL (which doesn't exist for PostgreSQL) and just plain lazy, I have another MSSQL database for non-GIS persistent data. Overall, my dependency on the Microsoft stack is costing us an extra $80 per month. That is OK for now, but going forward we will definitely switch to hosting the web server on a Linux box, and ASP.NET Core on Linux may now be the best option for that migration.
For the web application I initially had our site running on B1, but memory is the main bottleneck, as I keep some of the datasets (like the list of all cities, the list of all exits, and routes for active sessions) in memory. The lesson I learned is that when memory pressure is primarily introduced by such pre-cached datasets, scaling up, not scaling out, is the right approach.
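The scale-up argument follows from how such a cache behaves: every instance must hold the full dataset, so adding instances multiplies the memory bill without reducing per-instance pressure. A minimal sketch of a load-once process-wide cache (the loader lambda here is a hypothetical stand-in for the real city-list query):

```python
import threading

class PreloadedCache:
    """Process-wide cache for datasets loaded once at startup
    (cities, exits, ...). Memory use grows with dataset size, not
    with request volume -- so scaling out just duplicates the same
    footprint on every instance, while scaling up actually helps."""

    def __init__(self, loader):
        self._loader = loader
        self._data = None
        self._lock = threading.Lock()

    def get(self):
        if self._data is None:
            with self._lock:
                if self._data is None:  # double-checked under the lock
                    self._data = self._loader()
        return self._data

# Hypothetical loader standing in for the real database query
cities = PreloadedCache(lambda: ["Seattle", "Miami"])
```

The lock ensures concurrent first requests trigger only one load, which matters when the dataset takes seconds to pull from the database.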
CDN — a must for customer-facing sites
And finally, this one is a no-brainer and the cost is insubstantial. A CDN, along with other optimizations, helped significantly reduce page load time.
WebPageTest.org is a great tool for client-side performance measurement and analysis.
And what’s this StopByStop you keep mentioning?
How does it feel to be a cartographer in the age of Google Maps? Interview with Mamata Akella
Cartographers are the artists of the GeoGeek world, and each map bears the signature of the artist who created it. But in the age of Google Maps, is it hard for a cartographer to express himself/herself the way one could with paper maps? Who better to ask than a cartographer who has worked with Esri and CartoDB and is a recognised face in cartographic circles: Mamata Akella, senior cartographer at CartoDB.
It doesn't get more "GeoGeeky" than talking about maps and cartography with a passionate cartographer. Here's the summary of what was a really wonderful chat with an awesome cartographer, for the second post in the OffBeat GeoGeek series. Read on…
How did you get into mapmaking? Or to be more precise, Cartography? 😉
How relevant is Cartography in the age of Google Maps? Have computer scientists taken over the art of map making? How do you think Cartographers have adapted to the digital challenge?
People that I work/worked with who don’t have formal cartographic training are some of the smartest and most passionate cartographers I know.
If you had a magic wand to eradicate one (cartographic) error on digital maps, what would that be?
You have worked with Esri, the National Park Service and CartoDB and are a well-recognised cartographer. What is your advice for budding GeoGeeks and the next-gen geospatial grads?
Digital maps, unlike paper maps, don't necessarily have to restrict the amount of information. Does this make life easier when you need to create a new base map?
Think of one map design that has to look good through multiple zoom levels for the entire world!
If you had to pick one map as the most geoawesome one that you ever made. Which one would it be?
Lately, my favorite maps are the ones that don’t use Web Mercator!!
Haha, I can’t disagree with you about Web Mercator (can’t we just use another projection already)! Thanks, Mamata 🙂