Announcing Pentaho 7.0 (available mid-November)
I'll go straight to it: this is the most spectacular release ever!
This previous sentence would be even more meaningful if I hadn't been deeply involved in this release, and by "deeply involved" I actually mean that sometimes I was able to sneak into the development rooms and a few times speak to a few of the devs before the heads of engineering kicked me out of the room... but still, the janitor sometimes patted me on the back when he saw me crying in a corner and said that someone must listen to me, so I'm taking his word for it...
Anyway, here's the announcement, and mid-November it will be available for download!
The Year of the Product
At the beginning of the year, our CEO, Quentin Gallivan, gave us a challenge: "Make this the year of the product!" In CEO-language, this basically means I'm gonna be fired if we don't make good progress in a journey to improve usability and ease of use! That's motivation in my book!
So here's the main announcement of Pentaho 7.0, which will be made available to download mid-November. These are the main release highlights:
Figure 1: 7.0 Release Highlights
I'm going through these in a somewhat random order.
Admin Simplification
The Pentaho Server
This has been a long-term goal internally, and we've been testing it in CE since 6.1. The BA Server / DI Server distinction is no more (actually, I don't make it a secret that I think it should never have been created, but that's just my sweet person talking...).
We now have one single artifact: the Pentaho Server, with full combined BA/DI capabilities. It's important to note that this doesn't change the deployment topology strategy: there will be a lot of times, especially in larger organizations, where it will make sense to have multiple servers, some dedicated to the more interactive, BA-style operations and others optimized for the heavy-duty data integration work.
A simplified architecture
It's a fact that our product is architecturally complex. Not because we want it to be; it's a consequence of us being the only vendor with a platform that works all the way through the data pipeline, from the data integration to the business analytics side.
Figure 2: The data pipeline
We're still faithful to the original founders' vision: offer a unified platform throughout all these stages, and we've been tremendously successful at that. But we believe it's possible to combine this vision with an improved, and much simplified, user experience. And that's why we're doing this.
Some of you have been around long enough to recognize this image:
Figure 3: Oh my god, my eyes!!!
We're moving to a much simpler (conceptual) approach:
Figure 4: Pentaho Architecture
This means that going forward, we want to focus our platform on two main cornerstones: PDI and the Pentaho Server. And we're working on making the two interact as seamlessly as possible.
Please note that this doesn't mean we're not counting on other areas (Mondrian, PRD, CTools, I'm looking at you); on the contrary. They'll keep being a fundamental part of our platform, but they will take more of a backstage role, keeping all the wheels turning, instead of taking a front seat.
Connecting PDI to the Pentaho Server
One of the first materializations of this concept was the work done on connecting from PDI (Spoon) to the Pentaho Server. It's now a much more streamlined experience:
Figure 5: Pentaho Repository Connection
Once the connection is defined, we get a new login experience:
Figure 6: Logging in to the Pentaho Server
Once logged in, there is an indication of where we're connected to, plus a few simpler ways to handle those connections:
Figure 7: Identifying the current connection
And remember when I mentioned the simplified architecture? Now both the Data Integration user and the Business user have access to the same view:
Figure 8: Different views over the same ecosystem
A lot of optimizations were done here to allow a smoother experience:
- Repository performance optimizations (and we still want to improve the browsing / open / save experience)
- Versioning is turned off by default
- That somewhat annoying commit message every time we save is now also turned off by default
- Every connection dialog now connects to port 8080 and the pentaho/ webapp instead of port 9080 and pentaho-di, which has now been somewhat discontinued (even though for migration purposes we still hand out this artifact)
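To make the change in defaults concrete, here's a minimal sketch of the old and new endpoints. This is not Pentaho code: the `server_url` helper, the `http` scheme and the `localhost` host are assumptions for illustration only; just the ports and webapp names come from the list above.

```python
from urllib.parse import urlunsplit


def server_url(host: str, port: int = 8080, webapp: str = "pentaho") -> str:
    """Build a server URL from host, port and webapp name (hypothetical helper).

    The defaults mirror the 7.0 connection dialog: port 8080, pentaho/ webapp.
    """
    return urlunsplit(("http", f"{host}:{port}", f"/{webapp}/", "", ""))


# New default in 7.0: the single Pentaho Server.
pentaho_server = server_url("localhost")

# Pre-7.0 DI Server default, still handed out for migration purposes.
legacy_di_server = server_url("localhost", port=9080, webapp="pentaho-di")

print(pentaho_server)
print(legacy_di_server)
```

If an existing setup still points at the 9080 / pentaho-di pair, that is the artifact kept around for upgrades; new connections would use the first form.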
Migration
It's fundamental to note that existing installations with the BA / DI configuration won't turn into some kind of legacy scenario. This configuration is still supported and, quite the contrary, it is still the recommended topology. This is about capabilities, not about installation.
In 7.0, for migration purposes, we'll still have the baserver / diserver artifacts, for upgrades only.
Analytics Anywhere
A completely new approach
Ok, so this is absolutely huge! You're certainly familiar with the classic data pipeline that describes most of the market positioning / product placement:
Figure 9: Data pipeline
In this scenario we identify three different funnels: Engineering, Data Preparation and Analytics. But we started thinking about this and got to the somewhat obvious conclusion that this doesn't actually make a lot of sense. The truth is that the need for analytics arises anywhere in the data pipeline.
Being one of the few products that work in all three of these areas, we're in a unique position to completely break this model and deliver analytics anywhere in the data pipeline:
Figure 10: Analytics Anywhere in the data pipeline
And 7.0 is the first step in a journey that aims to break these boundaries while working towards a consolidated user experience. The first materialization is bringing analytics to PDI...
An EE feature
This is huge. Really huge! And let me say from the beginning that this feature is EE only. Why? Because that's where it falls according to our CE/EE framework: it's not engine-level functionality, and while its absence doesn't prevent any work from being done, it drastically accelerates the time to results.
And just a word on this: even though I'm the Community guy, and one of the biggest advocates of the advantages of having a great CE release, I'm also a huge proponent of a good, well-thought-out balance between the CE and EE versions. This balance is never easy to get to: we know we can't be 100% open source, and we know we'll absolutely lose this battle if we're completely closed source. The sweet spot is somewhere in the middle.
Entry point
Starting with 7.0, we'll see a new flyover in PDI with two buttons:
- Run and inspect data
- Inspect data
Figure 11: Analytics entry point
The difference between the two is subtle but will grow in importance over time. The first option always runs the transformation and gets the set of data to inspect, while the second option gets the data from the cache if it's available; if not, it acts as the first one.
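Roughly, the two buttons behave like the sketch below. It only illustrates the behaviour described above and is not how PDI implements it: the `InspectionSession` class, its method names and the in-memory cache are all hypothetical, and it assumes that "Run and inspect data" refreshes the cache that "Inspect data" reads from.

```python
from typing import Callable, Dict, List, Optional

Rows = List[Dict[str, object]]


class InspectionSession:
    """Hypothetical model of the two flyover buttons (not a PDI API)."""

    def __init__(self, run_transformation: Callable[[], Rows]) -> None:
        self._run_transformation = run_transformation
        self._cached_rows: Optional[Rows] = None

    def run_and_inspect_data(self) -> Rows:
        # Always execute the transformation, then keep the result around
        # (storing it here is an assumption, see the note above).
        self._cached_rows = self._run_transformation()
        return self._cached_rows

    def inspect_data(self) -> Rows:
        # Serve from the cache when there is one; otherwise behave
        # exactly like "Run and inspect data".
        if self._cached_rows is None:
            return self.run_and_inspect_data()
        return self._cached_rows


if __name__ == "__main__":
    runs = []

    def fake_transformation() -> Rows:
        # Stand-in for a real transformation; records each execution.
        runs.append("run")
        return [{"run_number": len(runs)}]

    session = InspectionSession(fake_transformation)
    session.inspect_data()          # no cache yet, so the transformation runs
    session.inspect_data()          # served from the cache, no new run
    session.run_and_inspect_data()  # always runs again
    print(f"transformation executed {len(runs)} times")
```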
A new Data Inspection experience
If we click either of those options, we should land in a completely new Data Inspection experience:
Figure 12: A new Data Inspection experience
The first thing you'll see here is obviously the most immediate kind of information you'll expect to see: a table that shows the data that's f
get