So I created a cube using my the visit/pageview that I recently started collecting. The cube is VERY simple – only 4 dimensions {App, Page, Source, Time} and just a single measure – {Hit}. I created the cube using the 2005 versions of Visual Studio and Microsoft’s Analysis Services. The IDE’s wizards pretty much walk you through the process, which is great because creating an Analysis Services project from scratch is more than a little intimidating. Especially if your a web developer like me and you don’t know a whole lot about querying, let alone designing a cube.
Anyway, like I said, my cube is very simple. The App dimension only contains 2 members: ‘mattberseth’ for this site, and ‘mattberseth2′ for my live demo site. The Page dimension contains all of the unique URLs for both sites as its members, the Source dimension is essentially a bit field for determining if the traffic was the result of a direct hit or from a referring site and finally the Time dimension represents the calendar and is used for counting hits by a time interval (i.e. Days, Weeks, Months, etc…).
In my simple cube, all 4 dimensions and the Hit measure are currently coming from a single table. I have simulated the standard star schema by generating the keys for the different dimensions using the following 4 SQL queries. So the first query in the SQL snippet populates my measure group and the other three are responsible for populating the App, Source and Page dimensions.
-- builds the fact_pageview measure group ***
select
-- if the referring url is empty than we know the source comes from a direct hit
(case when ref_url = '' then 1 else 0 end) as source_id,
-- the app_id - either mattberseth or mattberseth2
app_id,
-- turn the page url to an int
checksum(url) as page_id,
-- extract the date portion of the datetime
convert(datetime, floor(convert(float, date))) as date,
-- each row is a sinlge hit
1 as hit
from
-- my visit table
visit_load with(nolock)
-- builds the app dimension ***
select
'mattberseth' as app_id,
'My Blog' as app_name
union
select
'mattberseth2',
'My Live Demo Site'
-- builds the source dimension ***
select
0 as source_id,
'Referrer' as source_name
union
select
1,
'Direct'
-- builds the page dimension ***
select
checksum(url) as page_id,
url as url
from
visit_load with(nolock)
Next, I created views for these 4 queries, let my Analysis Services project know about them and used them as the data source for my cube. Conceptually, this diagram shows how these 4 queries are related.
Browsing and Querying the Cube
And amazingly, only after a few minutes of nexting through wizards and drag and drop design work, I deployed and processed the cube to my local Analysis Server instance. And now I can start taking a look at the data.
Browsing the Cube
Once the cube is deployed and processed, you can start browsing it. Below are a couple of screen shots that show the structure of my cube on the left, and hit counts for my two sites (mattberseth.com and mattberseth2.com) segmented by the traffic source (either direct traffic or referring site). The screen shot below that shows these counts as a percentage of the grand total. Looks like direct traffic to my demo site only makes up 2% of my total traffic ;(
Querying the Cube
And if you can stomach writing a little MDX, you can write custom queries to extract even more useful information. Below is a sample MDX query and result set that shows the average traffic per day for both mattberseth.com and mattberseth2.com segmented by week. The numbers are a little deceiving because only Week 39 consists of a full 7 days, but I think you can get the picture.
with
-- define the Weekend and Weekday sets
set [Weekday] as
{
[Time].[Day Of Week].[Day 2],
[Time].[Day Of Week].[Day 3],
[Time].[Day Of Week].[Day 4],
[Time].[Day Of Week].[Day 5],
[Time].[Day Of Week].[Day 6]
}
set [Weekend] as
{
[Time].[Day Of Week].[Day 1],
[Time].[Day Of Week].[Day 7]
}
-- create a few calculated meausres based that make use of these sets
member [Measures].[Weekday Average] as avg([Weekday], [Measures].[Hit]), format_string = '#'
member [Measures].[Weekend Average] as avg([Weekend], [Measures].[Hit]), format_string = '#'
member [Measures].[Weekly Average] as avg({[Weekday], [Weekend]}, [Measures].[Hit]), format_string = '#'
select
{
[Measures].[Weekday Average],
[Measures].[Weekend Average],
[Measures].[Weekly Average]
} on 0,
non empty
{
[App].children * [Time].[Week Of Year].children
} on 1
from
[PageView]
What’s Next?
Well, I am pretty excited. I only have a handful of development hours invested in my visit cube (it honestly took longer to write this post than it did to create the cube) and I can already tell I have made the right decision by maintaining my own pageview/visit database. Of course there is still a lot to do …
- My pageview JavaScript tracking code needs some work. I have been tweaking it over the past 2 weeks to play around with different techniques to keep my tracking request from getting cached. I have finally come within a few percent of what Google Analytics is reporting so I am happy. I think I will clean the handler up and write a quick post describing what I did.
- My Source dimension on has 2 members – Direct and Referrer. I would like to break down the Referrer further to include Search Engines, Community (DNK, Reddit, Digg, DZone, etc…), and forums (forums.asp.net, stackoverflow, etc…).
- I need to extract keywords from the Search Engine sources and get them into the cube
- Look up geography information based on IP
- IP + User Agent Sessionization. I would like to track time on site, navigation paths, etc…
- Incorporate additional dimensional data from my Moveable Type database
At some point I plan on sharing the solution: JavaScript tracking code, HttpHandler, OLTP and OLAP databases as well as the Analysis Services Project …
That’s it. Enjoy!

You have probably already read the great news that Microsoft is going to begin shipping jQuery with Visual Studio and


With or without the cooperation of television networks, your favorite TV shows have moved to the internet in a big way. On Tuesday we asked you to 


Apart from the sites dedicated to streaming content, more and more networks are building their own on-site tools for streaming content from the source. For example, the big three networks all have streaming sites: the
The
Windows/Linux only: The Giveaway of the Day web site is offering up the “Professional Edition 2.0″ of EASEUS Partition Manager, a free tool for formatting, moving, and resizing partitions on a hard drive. In addition to the tools offered in EASEUS’ (
When your brother’s starting up a new business and keeps asking you for the best places to do things online like send faxes, get legal help, or find a virtual assistant, send him a copy of The Ultimate Small Business Owner’s Resource Guide. The book normally costs $35 for a print version, but it’s available as a free PDF download here today. The 102-page volume is a compendium of webapps, engines, indices, software, and online tools for small biz owners looking to get things done cheaply and easily. You can find most of these recommendations online with some creative Googling, but the book offers a quick look-up with a well-organized table of contents. Aside from a few odd mistakes (like calling Google Calendar “Gmail Calendar”), this book’s recommendations align with many you’ve seen here at Lifehacker.
If you prefer sticking to the speed of the command line whenever possible, weblog CatsWhoCode details how to use
Windows only: Free application ClickGone (or One Click App Killer—there's some confusion over the actual name) turns your mouse cursor into a crosshair and forcibly kills the first app your pull the trigger on. Although it's fun to fire up Internet Explorer and kill it for sport, the application is actually intended to kill off frozen apps quickly and easily. Assuming you were to create a keyboard shortcut to run ClickGone, it could serve as a simple go-to for knocking out a stalled application. This single-use utility is freeware, Windows only.