Wednesday, 12 October 2016

MakeOverMonday Week 40 - EU Transport Satisfaction

This week the makeover was of a satisfaction survey on EU transport. Rather than look at the breakdown of opinion, I looked at the change between 2012 and 2016 and created this arrow plot. I used red/green for good and bad changes but, importantly, I double-encoded this with the arrow heads pointing in the right direction too. 

Tuesday, 11 October 2016

Why are my Tableau Server extracts failing?

This is the first in a planned series where I document my adventures in the exciting world of Tableau Server. I've been using Tableau Server ever since I first found Tableau, but for the first time I am getting into the admin side of things. I might even end up going on training for it, now that's serious! 

Anyhow, one of the first tasks I have been asked to look at is monitoring the server. Now, Tableau Server comes with some dashboards to look at things like user activity, disk space and extracts. These do a pretty good job of saying what's currently going on, but you soon find out that you need a bit more information, and that's the point of these posts: to walk through what I needed and how I did it. 

First up, Extracts

We use lots of extracts on our server, for a number of reasons. A lot of our data doesn't come from fast warehouses; in many cases it comes from live databases, so we extract to make sure that user interaction in the dashboards is as fast as possible. When looking at the extracts, there are four tasks that I want to do:

  1. Identify and fix all failing extracts: find those one-off errors and put measures in place to stop them happening again, and fix those that are constantly failing due to connection passwords changing, etc. 
  2. Once all the failures have been addressed, the next step is to look at usage: how many of these extracts are being used, what the total number of monthly views is, and what the size is. Old, unused extracts can be turned off to reduce load and space on the server, and duplicates removed. 
  3. With just the needed extracts left, and none of them failing, the next step is to optimise them: reduce the complexity, increase performance, and tune them up. 
  4. Finally, use the new v10 alerts function to tell users when their extracts have failed so that they can monitor it themselves. Self-service FTW!

So the first little issue is failing extracts. Tableau Server comes with a nice little dashboard called "How Have Extracts Performed on This Server" that lets you select some filters and find extracts that have fallen over or taken a long time. The first job I have been asked to do is find out why these are failing and either fix them or at least know what's going on. 

What I can see is that we have roughly 100 failures every day. Now, is that good or bad? Is that 100 extracts failing once, or one extract failing 100 times? Are they old or new? Are they even being used any more? Like most things related to Tableau, one answer always leads to more questions. There is a separate tab to look at the amount of traffic to a data source, but that wouldn't list the ones that have failed. As this is a pretty specific query, it looks like I'll have to build my own dashboard to assist. And that's never a bad thing; it's always nice to roll up the sleeves and build something.  

I've connected to Tableau's internal postgres DB and looked at the _backgroundjobs view, which lists all the jobs that have run, including the extracts that have failed. Then, by joining to the _datasources view, I can build a dashboard that shows me all the jobs for the past month and whether they passed or failed. I can see the total number of jobs and the number that have failed for the past 7 days, whether they are going up or down, and what proportion fail. This is important as a top-level metric: the numbers that the team leaders are going to want to see decreasing. 
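For anyone wanting to build something similar, the query underneath it looks roughly like the sketch below, which you could drop into a custom SQL connection. These views aren't officially documented, the column names here are the ones I found on our server, and the join is my assumption, so check everything against your own version of the postgres schema first.

    -- Sketch: extract jobs and failures per data source for the past month.
    -- View and column names are assumptions based on our server's schema.
    SELECT
        ds.name                                              AS datasource,
        COUNT(*)                                             AS jobs_last_month,
        SUM(CASE WHEN bt.finish_code <> 0 THEN 1 ELSE 0 END) AS failures,  -- 0 = success here
        MAX(bt.completed_at)                                 AS last_run
    FROM _backgroundjobs bt
    JOIN _datasources ds
        ON ds.name = bt.title                      -- assumes the job title holds the data source name
    WHERE bt.job_name LIKE '%Refresh Extracts%'    -- extract refreshes only
      AND bt.created_at > NOW() - INTERVAL '1 month'
    GROUP BY ds.name
    ORDER BY failures DESC;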
But crucially, we need to know whether a failing extract is a one-off, something that has always failed, or something that has recently fallen over. That's the important thing to look at with any error. Problems like disk space might come and go as old data gets pruned, but changes to database names might be a recurring issue. Just getting a list of the current failures doesn't give enough information. I've added grey bars using reference lines, showing today in dark grey and the last 7 days in lighter grey, to tie the view back to the overview bar charts at the top. This also helps to quickly see if the issues are recent or have been going on for days or weeks. 

Then, of course, the next step: so this extract is failing, but what is the error, who published it, when did it last run correctly, and is it even being used any more? All of this comes from the postgres DB and is added to the tooltip.

So now I have a single dashboard where, at a glance, I can see if we have site-wide issues, local project issues, a recurring issue that needs intervention, or just a blip that will sort itself out. 

Once I have all the monitoring dashboards in place to check that things are not failing, I can go on to the next step: look at usage, remove the extracts that are not needed any more, and finally tune the ones that remain. 

If these next steps go as well as the first, then I will have made a huge step in getting the server under my command. Muhahahah

Tuesday, 4 October 2016

The Rise and Fall of Global Peace

This week for MakeOverMonday it was the Global Peace Index. The original was the classic example of using a map to show changes over time, something that never works very well. When I started to look at the data I saw that although the overall global peace score hadn't changed significantly, some countries, Syria and Libya among them, had seen large rises. That meant some countries must also have improved, and so my idea formed: to split the world in two and show both the countries that had declined and those that had improved. Once I had that idea, the black and white colour scheme followed. I used the maps both as a navigation tool and as an indication of the division in the world.

Monday, 12 September 2016

MakeOverMonday Week 37 - The Box That Contains The World

This week for MakeoverMonday the data is all about global shipping, looking at how many ships and containers are owned and run by the 100 largest container companies. As soon as I saw this data set I knew what I wanted to do: the treemap lent itself perfectly to being displayed as the containers on a ship. All I had to do was a quick Google search for a decent image and a slight edit to make it a transparent overlay, and it was done. A little background on shipping containers and voila, The Box That Contains The World

Monday, 22 August 2016

The Rise of Malaria in The Democratic Republic of Congo

The Democratic Republic of Congo is the second largest country in Africa. In 2014, a quarter of African malaria deaths occurred in the DRC, and the 10 countries that surround it accounted for another quarter. Since 2006 the surrounding countries' malaria deaths have been declining, but the DRC's have continued to rise, peaking at nearly 31,000 in 2013. Civil war prevents medical supplies from reaching those that need them. Refugees fleeing conflict leave behind vital mosquito nets.

This viz looks at the issue. 

Wednesday, 10 August 2016

Celebrate the upcoming release of Tableau 10, get my Udemy course for $10

To celebrate the upcoming release of Tableau 10, I'm offering my Udemy course on vizzing data with Tableau for just $10 (see what I did there?). Be sure to follow the link below to get it at the discounted price.

Tuesday, 2 August 2016

How to filter data when you don't have the data?

Filters in Tableau are great: they let you get rid of the data that you are not interested in and home in on the data that you are. They do, however, have one big flaw. You can only create a filter if you have the data. What do I mean by this? Well, let's look at an example. 

We want to create sales dashboards for our sales areas, which are divided into regions: Central, West, East and South. Each dashboard should look at just one region at a time.

So we could do that using a filter, right? Well, let's see what filter options we have. Dragging Region onto the filter shelf shows us this:

We only have three regions to choose from; it's not possible to select the West region. The reason is that filters only filter the values present in our data. If a value isn't present, we cannot filter on it. We haven't yet had any sales for the West region, so it isn't in our data yet and we cannot create a filter on it. We do know, though, that we will be getting data for it, so how can we set up filtering for data that we haven't got yet?

So what do we do in this instance? Well, we can create the East, South and Central dashboards, as those three regions are present in the data, but what about the West?

Well, for the West dashboard we could use an exclude filter and select East, South and Central to be excluded, leaving only the remaining region. Then, when the sales for West arrive in the data, the filter will still work and our dashboard will show just the West's data. 

But, and it's a big but, the sort that Sir Mix-a-Lot would like: are we certain no more regions will ever be added? What would happen if a North region got created and data started to be associated with it? Well, the East, South and Central dashboards would be fine; they include only the East, South and Central regions, so any new region would just be ignored. However, what happens to the West dashboard? That filter is only excluding Central, East and South; any other value of Region is welcome. This means that the North region's data will be included, silently. Exclude filters are only good if you don't care about extra data being added. Sometimes that's what you want, but it's better to include if possible. 
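To see why, it can help to write the two filters out as calculated-field logic. This is just an illustration of what each filter keeps, not what Tableau stores internally:

    // The exclude filter keeps anything NOT named, so a new North region sneaks in:
    NOT ([Region] = 'Central' OR [Region] = 'East' OR [Region] = 'South')

    // An include filter keeps only what is named, but we can't tick 'West'
    // in the filter dialog until it exists in the data:
    [Region] = 'West'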

So, there's the problem: how can we create a filter for data that isn't yet in our data source? The answer, like most things, is parameters. 

We need to create a parameter, tell Tableau to match it against Region, and then add that to the filters. 

First, let's create a Region parameter: a string parameter with a list of all four values, Central, West, East and South.

Now we create a calculated field based on the value of the parameter. 
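The calculation itself is just a comparison of the parameter against the field. Assuming the parameter from the previous step is called Select Region, it looks something like this:

    // TRUE for rows that match the region chosen in the parameter
    [Select Region] = [Region]

Because the result is a boolean, we can filter on it directly.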

Add this to the filter shelf, keeping only True,

and then test it out.

We can now select a region and the filter works. When we select West, we get a blank sheet, which is correct: there is no data, so nothing is going to be shown. However, when the West's data starts to be added, this filter will become "active" and only show the West's data. 

Parameters are a great way to take control of your data viz; you can ensure that filters work how they should and that only the data that you want to be seen is seen. 

Thursday, 28 July 2016

40 Years of Executions, 50,000 Wasted Years

This viz was part of MakeOverMonday, looking at executions since 1979. I decided to take a look at the wasted years of the executed people, based on their life expectancy according to their race and state. In 40 years over 1,400 people were executed: a total of nearly 50,000 wasted years that never were because of the crimes they committed. If the death penalty is the ultimate deterrent, then maybe looking at the data this way might do just that. This viz was inspired by the superb Gun Deaths viz by Periscopic.

Wednesday, 27 July 2016

Don't use network drives with Tableau, use Network Locations

We have users that need to use Excel workbooks as data sources in their Tableau dashboards, which are then published to Tableau Server. The problem lies with how Windows and Tableau handle network path names. 
Most people will connect to a shared drive by mapping a network drive, like this

And map the network drive to a letter, in this case Z. 

Now all that works fine: Tableau Desktop will happily connect and all is right in the world. But if you then publish that workbook to the server, the problem arises with that drive letter. Tableau Server has no idea what that Z drive is; it thinks it's a local drive on the server box somewhere, so it cannot find the correct file. This makes for a sad panda.

Instead, we need to make sure that Tableau uses the UNC path name. The UNC path in this example is \\fastnfs\tableau_data_sources, the same one we used when we mapped the drive to the letter Z. 

Even though, from here, it knows the UNC path...

You can see the path that the data source is using by right-clicking on the name in the data window and selecting Properties,

and if we click on the filename in the data connection window, we can see it has defaulted back to using the drive letter. 

To fix this we need to edit that path to be the full UNC one.
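For example, with a hypothetical file called sales_data.xlsx, the edit is from the mapped-drive form to the full UNC form:

    Z:\sales_data.xlsx                                (what Tableau records)
    \\fastnfs\tableau_data_sources\sales_data.xlsx    (what the server needs)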

And then, if we publish, Tableau Server knows where to go and look for the file, assuming you have given the Tableau services user access to the directory.

This all works fine, except it relies on someone remembering to manually edit the path to the full UNC one and not the drive letter version. 

The good news is that there is an easy solution, and it's in how you set up your access to the shared network drive in the first place. Instead of mapping a network drive, you map a network location. The process is the same as mapping a drive, except it doesn't get a drive letter. 

Choose the custom network location, click next.

Put in the UNC path for your shared drive

Give it a name

click finish and voila

We now have a new network location, note the different icon. 

Now if we create a data connection and use that instead of a lettered drive

And then, when we check the data source properties, we see it's got the correct UNC path. We haven't typed a thing or had to edit anything. 

Hope this helps anyone using files on a network share as a data source. If you need any help setting this up, just give me a shout. 

Thursday, 21 July 2016

Bizarre Tableau Server behaviour

We were looking at using Excel sheets as a data source with Tableau Server. The idea is that we have a central repository that mirrors the Tableau Server project structure: one folder per project per reporting team. People can then place their Excel, CSV, etc. files into their folder, create a dashboard and publish it to the server. This is going to be very popular once Tableau 10 is released, as we will be able to use cross-database joins to combine MySQL databases and Excel sheets into one single data source. What we would then encourage people to do is not only publish their workbooks but also publish their data sources, so that they can be used by other people to build their own dashboards. 

As part of a test to check that this all works, I have come across a very, very strange issue. 

I have an Excel workbook that contains a sheet called Title, with two rows of data

This is used to create the title of the Tableau workbook when it gets published. 
I create a simple viz just to check that it's able to find the file, ensuring that I used the correct UNC path name.
I then published this to the data server, ensuring I unticked the "External file" checkbox and gave the data source a completely new name. 

I then connected to the newly published data source and looked at the data from it, and to my surprise, it's different from the original. 

Now this is very strange: by publishing and then connecting to the new source, the data has changed somehow. To check this I got someone else to try it on their machine, and they got the same results. Then I opened up Tableau and used the web editor to connect to the published data source and do the same thing, and guess what, I got a different result.

Now it says June 2018, not June 2019, or June 2016 - Original as it should. 

Very odd, I thought, so the next step was to create a local copy of the published data source and see what that did; maybe that would show me where the data in the published data source was coming from. 

So, the local copy of the published data source has the correct UNC path and pulls back the correct information. The information contained within the published data source must therefore be correct, or else the local copy would be wrong too, but it's not. 

This means that I no longer trust Excel sources published as a data source outside of a workbook, which is a real bummer as that's one thing we planned on doing. I haven't yet looked to see if this still breaks if we use an extract, but we want to have live connections to the Excel sheets so people can see their updates in real time without having to wait on a schedule. 

Tuesday, 12 July 2016

How Long will Theresa May Be Prime Minister?

On the 13th of July, Theresa May will become the 13th Prime Minister to serve under Queen Elizabeth II. I thought I would take a look back at the other 12, see how long they lasted in office and see how long Theresa might last. It seems that less than a year is the low point to beat, and 11 and a half years is the record. This is a simple Gantt chart using two dates and a DATEDIFF calculation. The donut chart was made using the tutorial on Andy Kriebel's blog here
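For anyone recreating it, the bar length on the Gantt is a single calculation. The field names here are my stand-ins, not necessarily what's in the workbook:

    // Days in office, used as the size of each PM's Gantt bar
    DATEDIFF('day', [Took Office], [Left Office])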

Thursday, 7 July 2016

Zens On Tour - A 470 Mile Tableau Odyssey

Wow. Just wow. What a week, well, four days, it has been. The Zen Master UK Tour is at an end and I am on my final leg of travel back home. It's been an amazing experience and I am so glad that I was asked to take part. It's by far the best thing I have done as part of the Zen program. 

We started out in Edinburgh, my first trip to the city, and although we didn't get to do much sightseeing, the venue offered an amazing view of Arthur's Seat. 

The first of the #MakeOverMonday sessions went so well; people got really engaged with the idea and got stuck into the data. Some people were total beginners, on their first time with Tableau, but even so they were able to create something within an hour that they could share with the rest of the group. It's amazing that you can get such a variety of views from the same simple dataset. Everyone really enjoyed the chance to play with a dataset that was different from their usual one, and I think that is one of the best things about #MakeoverMonday. It allows people to play with data, to just experiment, in a safe environment. Nothing that they produce is going to be used to make a big decision, so it can be a learning experience, a chance to try something new without risk of failure. 

In the afternoon, everyone was treated to yours truly giving a talk on the role of colour in data visualisation. I think they liked it; no one walked out or fell asleep, so that's a result in my book. I was then joined onstage by my fellow Zen Masters for a great Q&A that started with colour but soon went off into other areas. We then had some networking, which was a great way to chat to people, hear their stories and try to give back some of the enthusiasm that we feel as Zens. 

We then all jumped on the Zen Bus, our ride for the week. Imagine the Iron Maiden tour bus; well, it was nothing like that, more Spinal Tap. It was a great ride down to Leeds, involving a few board games, magic tricks, conversations, beers and sleep. We even had time for a stop at the Golden Arches. 

We had another great session with people in Leeds, building fun vizzes for #MakeoverMonday and sharing their work; there was a real buzz in the room and hopefully the start of the Leeds TUG. Rob gave a terrific keynote about the challenge of making dataviz mean something to people that don't see the value in it: how to make people see beyond just pixels on a screen and see the data, and the people, behind it. 
We had another great Q&A following the keynote; people engaged with the subject and wanted to know more about how to drive that take-up. 

After a wonderful curry and walk through Leeds, and a decent night's sleep, it was an early start to get to our next stop, Birmingham. 

As for the bus journey to Brum, well, let's just say: you weren't there, man, you don't know what it was like. It was an epic journey, full of thrills, a few spills, some grey hair, swearing and frantic looking up of routes, but eventually, and I don't know how, we made it. 

More makeovers, Q&A and a great talk by Andy on his Dear Data Two project with Jeffrey Shaffer closed out the day, and after a great night watching the football it was off to London, on the train this time. 

We had a great turnout for the London event, in a very swanky hotel. We saw some great makeovers from people with very little experience, but everyone got something out of it, which was great: new connections made, info passed on. 

Chris Love gave a great and very thought-provoking talk on the art of keeping data viz simple. His arguments against overly complex data viz were well made, and I mostly agreed with them. It made me think about some of my earlier dashboards and how I would throw everything at them, and now I don't. Hopefully he will get to give that talk again, as I think it's great to hear how reducing the complexity of a viz can be a really powerful design choice. 

And that was that, the end of an epic week. It's been amazing talking to so many people, having so many people come out to see us, to take part in the makeovers, to see the talks. I really hope that this is something that happens again; it was a great way to spread the enthusiasm that we Zens have, and it's what we do and why we do it. 

We have to say a huge thank you to three and a half people for all their help in making this dream a reality. 

First, Louis Archer, the guy that came up with the crazy idea of putting the band on the road. He played the role of tour manager superbly: he kept us on time, told us where to go and organised it so well that it made the whole thing a total joy. 

Second, Andy Cotgreave, the unofficial leader of the Zens and the Andy of Andy and the Zens. His boundless energy and enthusiasm are what keep the Zen program the amazing thing it is; without them, we just wouldn't be the group of people that we are. Thank you for doing a great job (mostly) as MC for the week; you set each day up a treat.

Third, Jacob Clarke (the half I was talking about, as he only did two days). You played the role of roadie so well: you assisted in the Q&A with the microphone, helped out with the makeovers and, more than that, brought the fun to the group. Every successful Tableau event needs a Clarke, and you were ours. It's evident in the fact that it all went downhill when you left us. 

The unsung hero, though, was Marcus Wong. He went ahead of us each day, got the venue set up, put out the badges and made sure that when we arrived, we didn't have anything to do. 
You know all those awesome Tableau events we go to? You know how well they are run and how smoothly they work? It's because of the backroom guys like Marcus; they are the heroes. So Marcus, thank you, from all of us.

Finally I’d like to thank my fellow Zens, Andy, Craig, Chris and Rob for their friendship and support on this trip. We managed to get through it all without falling out, which says something about the group. All of us took time away from home, from families to do this, we don’t work for Tableau, we don’t get anything from this. 
We do this, cos this is who we are. 

We love playing with data, playing with Tableau. 

We make simple things, complex things, beautiful things, practical things. 

We write blogs, we talk on podcasts, we present. We teach and we learn. 
We will always do our best to help others join us on our Tableau and dataviz journey. 

We do it because we want to share. We do it because we love to help. 

We do it because we are the Tableau Zen Masters.  

And I've never been prouder to be one.