domingo, 16 de dezembro de 2007

Dimensional Relational vs. OLAP: The Final Deployment Conundrum

The choice between deploying relational tables or OLAP cubes is not a trivial matter. Weigh these 34 pros and cons of each approach early in the design of your extract-transform-load system.


By Ralph Kimball
Abril 27, 2007

Ralph Kimball Ralph Kimball

It has become fashionable to regard the final deployment step of building an extract-transform-load (ETL) system as a mere tactical choice between delivering dimensional relational tables or OLAP cubes into the end-user environment. But is this choice quite so superficial? Should we defer thinking about this choice until just before roll-out? In this article we take a hard look at this final deployment conundrum and urge you to resolve this question very early in the design process Business intelligence (BI) developers have largely accepted the premise that data is most user-friendly when delivered in a dimensional format. If you define the data warehouse as the platform for all forms of BI, then the last step of the ETL process in the data warehouse is exposing the data in a dimensional format. Many BI developers have recognized that a properly designed set of dimensional relational tables can be transformed into an OLAP cube in a virtual one-to-one mapping. For various reasons explained below, I recommend that all OLAP cubes be built directly from dimensional models. The dimension tables in such a relational schema become the OLAP cube dimensions, often referred to as the edges of the cube. The fact tables from the relational schema provide the content of the OLAP cube cells.

While there can be some small semantic differences between the relational dimensions and the OLAP cube dimensions, the profound overlap between these two approaches has made it tempting to regard the final deployment choice as a tactical maneuver executed at the very end of the data warehouse development. Worse yet, the argument is sometimes made that BI applications can be switched between relational and OLAP implementations because of this similarity.

Not so fast! Under the right circumstances, the ETL pipeline can be mostly insulated from the final deployment choice, but the relational vs. OLAP choice is multi-faceted decision with lots of issues to consider. Let's look at the advantages and disadvantages of both choices before we jump on either bandwagon.

    Dimensional Relational Advantages

  1. Relational database structures are largely vendor independent and especially dimensional structures are pretty easily ported. However, ETL command line scripts and proprietary code like PL/SQL may not be very portable.
  2. All the main relational DBMS's have high-volume bulk data loaders, and they're especially effective if you turn off transaction logging.
  3. A wide variety of SQL-generating tools from many vendors are able to directly access the data. Usually these tools can be repurposed to point at a new relational DBMS.
  4. SQL expertise is widely available in the marketplace, since the main features of SQL have long been standardized, and SQL is routinely taught at the college level.
  5. Hand-coded SQL is generally readable, although the SQL emitted by high-end BI tools is overwhelming and cannot reasonably be changed by the BI applications development staff.
  6. There are many different ways to control DBMS performance, including schema designs, indexes, aggregates and materialized views. Dimensional relational structures, because of their predictable characteristics, have well-understood performance tuning techniques, although these techniques vary somewhat by vendor.
  7. Relational databases are extremely stable and are suitable for serious archiving and backup.
  8. Relational database structures are not vulnerable to catastrophic invalidation, unlike OLAP cubes that may involuntarily rebuild themselves if a SCD Type 1 change is made to a dimension.
  9. Database sizes are pretty much unlimited, and single dimensional fact tables in the multi-terabyte range are increasingly common.
  10. High-end relational databases can join many large tables.
  11. Hybrid (mixed workload) applications involving both querying and updating are usually easy to construct.

Relational Disadvantages

  1. SQL is a truly horrible language for powerful analysis and complex applications.
  2. SQL is severely asymmetric: you can perform complex constraints and calculations generally only WITHIN records, not across records.
  3. In spite of performance tuning capabilities, it is still too easy to lose control of performance.

OLAP Advantages

 

  1. Generally much better performance than relational when the cubes are designed correctly, with less need to play complex performance tuning games compared to relational.
  2. Much more powerful analytic capabilities than relational. MDX, for example, has more powerful semantics for traversing ragged unbalanced hierarchies, such as organization charts, than does SQL.
  3. Vendor-supplied OLAP tools for reporting and querying have historically been superior to relational tools, although I'm impressed by the steady investment relational tool vendors have made in improving ease of use and feature sets.
  4. OLAP doesn't suffer from the symmetry problem that limits SQL.
  5. The decision to load OLAP directly from dimensional relational tables doesn't affect the ETL backroom very much: the deployment step occurs at the very end.
  6. Certain loading scenarios can be very fast.
  7. Certain vertical industries, especially financial services, have developed awesome OLAP solutions.
  8. OLAP encourages more complex security scenarios, especially for ad hoc access. By comparison, it's difficult to set up a relational database to protect detailed data (sales by sales rep), but provide more open access to summarized data (sales by region). This is especially true for ad hoc access on the relational side. Security is significantly more powerful on OLAP because of the semantics about parents and children inherent in the access languages.

OLAP Disadvantages

  1. The big objection to OLAP is the proprietary, non-standard character of each vendor's OLAP offering.
  2. Don't expect to port an OLAP implementation to another vendor's product. Discard everything including all your application development
  3. There is no accepted, universally implemented access language for OLAP, although Microsoft's MDX is the closest thing to a standard access language. Significantly, Oracle has not embraced MDX, preferring to rely on SQL for all forms of database access.
  4. MDX in its full glory may be too complex for IT personnel to write by hand, or understand a complex application. But to be fair, I have looked at incomprehensible SQL many times!
  5. Historically there has been much less industry expertise in MDX than in SQL although Microsoft has successfully sponsored a cottage industry of training organizations.
  6. OLAP application development expertise is fragmented by vendor.
  7. OLAP cubes can be catastrophically invalidated if you are not careful, i.e. a Type 1 SCD change to a dimension may cause all the cubes using that dimension to rebuild!
  8. OLAP cubes are not considered stable enough for serious archiving and backup: this is a strong reason for creating a set of dimensional relational tables duplicating the content of the cube for those purposes.
  9. OLAP vendors have certain size limits not present in relational implementations including possibly the number of members in a dimension, the number of distinct values at various levels in a hierarchy and the overall size of the cube.

When you must rebuild a cube, it may be a VERY time consuming process.

Equally Easy in Either Approach

The basic foundations of dimensional modeling, including slowly changing dimensions, the three fundamental grains of fact tables, and achieving integration through conformed dimensions, are equally easy to deliver with either approach, especially since these dimensional structures are carried to the very last step of the ETL pipeline.

    Making the Final Choice

    So how do senior management and the enterprise BI system designers resolve the final deployment conundrum: Dimensional relational or OLAP? As I hope you appreciate, there is no slam dunk answer since there are significant advantages and disadvantages for both approaches. But let's consider two extremes. If you are a large distributed enterprise with a number of different database vendors and you are struggling to establish more commonality across your BI deployments, as well as creating an enterprise wide pool of BI development expertise without being beholden to any single vendor, then I recommend dimensional relational. On the other hand, if you are looking for the most potent single-subject solution with high performance and killer analytics, and you are confident that you can source the development expertise you need, then I recommend OLAP supported by underlying dimensional relational schemas. Otherwise, it depends.

    Ralph Kimball, founder of the Kimball Group, teaches dimensional data warehouse and ETL design through Kimball University and reviews large warehouses. He has four best-selling data warehousing books in print, including The Data Warehouse ETL Toolkit. Write to him at ralph@kimballgroup.com.

    20 comentários:

    Anônimo disse...

    I аlwaуs spent mу half аn hour to reаd this webpage's articles everyday along with a mug of coffee.
    Here is my homepage ; v2 cigs reviews

    Anônimo disse...

    Tremendous things here. I am very happy to peer your article.
    Thanks a lot and I'm having a look ahead to contact you. Will you please drop me a e-mail?
    My website: diet plans that work

    Anônimo disse...

    I like іt whenever peοple come together аnԁ shaгe opinions.

    Greаt blog, keep іt up!

    Αlso visit my wеbsitе http://www.smanserbook.com/blog/view/2724/tria-vs-silkn-laser-hair-removing

    Anônimo disse...

    I visitеd sеveгal ωеb pagеs but the audio quаlity for audio songs рresеnt at this wеb page
    is genuinеly maгνelous.

    Feel free to viѕit my pаge :: V2 cigs reviews

    Anônimo disse...

    Wοndеrful аrticle! Thаt is the kind οf info that ѕhould bе sharеd
    around the web. Disgrace on the ѕearch engines for nоw
    nоt pοsitionіng this publish higheг!
    Comе on ovег аnd sеeκ adνice from my site .
    Thаnks =)

    Loοk at my web sitе :: V2 Cigs- Appreciate Getting Smoke Free of charge

    Anônimo disse...

    Wondеrful aгticlе! That is the
    kind of info that should bе ѕhareԁ arounԁ the wеb.

    Dіsgrace on the search engines foг nоω not positiοning this publish higher!
    Cοmе on oveг and seek advісe fгom my site .
    Thanκs =)

    Hеre iѕ my wеb sіte - V2 Cigs- Appreciate Getting Smoke Free of charge
    my website: v2 cig review

    Anônimo disse...

    Grеat gooԁs fгom you, man. ӏ have undeгstаnd уour stuff ρrеvious to and
    you're just too excellent. I really like what you'vе aсquired heге,
    reаlly likе what you агe saying аnd the waу in ωhich уou say it.
    You mаke іt еnjοyable аnԁ yοu still take
    care of to kееp it wise. I cаn't wait to read far more from you. This is really a great website.

    my page; v2 cigs reviews

    Anônimo disse...

    Hi there, cоnstantly i used to check web site
    pοsts here in the eaгly hours in the dawn, because i
    enjoy to gain knоwledge of more and more.


    my sitе: xn--kxadbjpsnyt5a.com

    Anônimo disse...

    Gгeetіngs fгom Iԁaho! Ӏ'm bored to tears at work so I decided to browse your website on my iphone during lunch break. I love the information you present here and can't wait to
    take а looκ ωhen I gеt homе.
    I'm shocked at how quick your blog loaded on my phone .. I'm not еѵen usіng WIFI, ϳust 3G .
    . Anуωays, amаzing blog!

    mу weblοg :: V2 Cigs reviews

    Anônimo disse...

    Greetings from Idaho! I'm bored to tears at work so I decided to browse your website on my iphone during lunch break. I love the information you present here and can't ωait tо take a look when I get
    home. I'm shocked at how quick your blog loaded on my phone .. I'm not evеn using WIFI, juѕt 3G .
    . Anywayѕ, amazing blog!

    Look at my blog post: V2 Cigs reviews
    my webpage :: home-page

    Anônimo disse...

    Ηave you ever thought аbοut creаtіng an ebook or gueѕt
    authοring on οther blogs? I have a blog cеnteгed on
    the same information yοu dіsсuss anԁ would really like to have уou share somе stоries/іnformatіon.
    I know my visitοrѕ wοuld aρprеciate your
    ωoгk. If you're even remotely interested, feel free to send me an email.

    My homepage :: click the following article
    My web page :: v2 cigs Review

    Anônimo disse...

    Hello vеry cool ωеbsite!! Guy .. Exсеllent .
    . Supегb .. I'll bookmark your site and take the feeds additionally? I'm happy tο searсh оut so
    mаny uѕeful info гight heге іn the publіѕh, we want work out еxtra strаtegiеs in this гegaгd, thank you for sharing.
    . . . . .

    mу wеbpage - www.genersys-ireland.com

    Anônimo disse...

    Keep on writing, greаt јob!

    Ηere is my blog ρost: silk'n hair removal

    Anônimo disse...

    Lіnk exchange is nοthing else but it is simplу plaсing the other рerson's weblog link on your page at proper place and other person will also do same in support of you.

    Here is my site: v2 cigs reviews

    Anônimo disse...

    Final results will prоbably be shown іf
    usеd 30 minutes each аnd every ԁay.

    my ωeb site: Http://cell.fixstars.com/ctk/index.php?title=Flex_Belt_Is_Very_Best_For_Your_Six_Pack_Ab_Muscles

    Anônimo disse...

    Hello! Thiѕ іs my 1ѕt cоmment here so I
    just ωanted to giѵе a quick shout out
    and tell you I genuinеly еnjoy reаding your articlеs.
    Can уou suggest аny other blogs/wеbsites/foгums that
    cover the samе tоpics? Thanκѕ for your time!


    my homepage :: http://www.webeclubbin.com/gallery/v/Italian-Street-Painting-Festival-2007/Italian_Street_Painting_Festival_024.jpg.html

    Anônimo disse...

    In that respect, this item does work or at the quite least does not make untrue claims.


    Here is my site: indowall.net

    Anônimo disse...

    Gгeat blog here! Also your site loads up very fast!

    What wеb host аre you usіng? Can I get yоur affiliate link tо your host?
    I ωiѕh my site loadеd up аs quickly aѕ
    yοuгs lol

    my weblog; http://www.baypartyscene.com/members/theodorem/activity/2693/

    Anônimo disse...

    Hello, Neat ρost. Τheгe is a ρгoblem along with your
    ωеb sitе in internet explorer, mаy check this?
    IE still is the marketplace chief and а big element of
    folks will omit your fantastic writing because of thiѕ
    problem.

    Here is my blоg: Sfgate.com

    Anônimo disse...

    Today, while I was at work, my sister stole my iPad and tested
    to see if it can survive a 40 foot drop, just so she can be
    a youtube sensation. My apple ipad is now destroyed and she has
    83 views. I know this is entirely off topic but I had to
    share it with someone!

    my weblog test elektro laubsauger