Hi all, this discussion came up in this thread and I thought that it might deserve its own thread. The entire idea revolves around exposing the primary key of the attribute view from the fact table (data foundation) rather than from the attribute view itself. Is this possible, and does anyone have any practical experience with this? If yes, what is the behavior like in the front-end tools?
Add two instances of the same field, one for use in join only
Connect the fact to the attribute view join field.
Hide the attribute view field, and only show the fact table field
Essentially, the aim is to leverage the value of the referential join in an analytic view, executing the join only when it is absolutely needed. By definition, the referential join behaves like a left outer join when no fields from the right table are selected; otherwise it is executed as an inner join.
In most traditional DW implementations, a surrogate key is used to map the attribute to the fact, so to get the natural key one would HAVE to execute the join. For example, to see the MATNR natural key for a given fact table, one would have to execute the join on a surrogate key number and retrieve the natural key, with SQL like the following:
SELECT B.MATNR, SUM(A.SALES) FROM
FACT_TABLE A INNER JOIN MATERIAL_TABLE B
ON A.MAT_KEY_NUM = B.MAT_KEY_NUM
GROUP BY B.MATNR
Now in HANA, we are making all joins on a natural key, so on the fact table there is no surrogate key for MATNR, only the actual natural key values. Keeping the above thread and discussion in mind, the thought is that the referential join could be leveraged here for queries that only pull the natural key of a given dimension and no related attributes (from the right table). I see this as the normal execution path for a huge portion of use cases, where a user only wants to group by MATNR (with no text or other attributes) and execute a SUM operation on the related fact measures. Here it seems that exposing the natural key from the fact table instead of from the attribute view would be extremely advantageous.
If fields from the right table (attribute view) are NOT selected, then the SQL would be similar to this (logical only - I'm sure HANA builds it quite differently):
SELECT MATNR, SUM(SALES) FROM
FACT_TABLE
GROUP BY MATNR
If fields from the right table (attribute view) ARE selected, then the SQL would be similar to this (logical only - I'm sure HANA builds it quite differently):
SELECT B.MATNR, SUM(A.SALES) FROM
FACT_TABLE A INNER JOIN MATERIAL_TABLE B
ON A.MATNR = B.MATNR
GROUP BY B.MATNR
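To sanity-check the logic of the two query shapes above, here is a minimal sketch using Python's built-in sqlite3 module. It is not HANA and says nothing about how HANA actually plans the query; it only compares the results of the fact-only aggregation and the inner-join aggregation. The MAKTX column and the sample rows are made up for illustration, and the join is on the natural key MATNR as described above.

```python
import sqlite3

# In-memory database standing in for the fact table and the attribute view's table.
# Table names follow the post; MAKTX and all rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MATERIAL_TABLE (MATNR TEXT PRIMARY KEY, MAKTX TEXT);
    CREATE TABLE FACT_TABLE (MATNR TEXT, SALES INTEGER);
    INSERT INTO MATERIAL_TABLE VALUES ('M1', 'Bolt'), ('M2', 'Nut');
    INSERT INTO FACT_TABLE VALUES ('M1', 10), ('M1', 5), ('M2', 7);
""")

# Query shape when only the natural key on the fact table is requested.
FACT_ONLY = """
    SELECT MATNR, SUM(SALES)
    FROM FACT_TABLE
    GROUP BY MATNR
    ORDER BY MATNR
"""

# Query shape when the right table is involved (inner join on the natural key).
WITH_JOIN = """
    SELECT B.MATNR, SUM(A.SALES)
    FROM FACT_TABLE A INNER JOIN MATERIAL_TABLE B
      ON A.MATNR = B.MATNR
    GROUP BY B.MATNR
    ORDER BY B.MATNR
"""

def run(sql):
    """Execute a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()

# Case 1: every fact row has a matching material row (referential integrity holds).
fact_only_ok = run(FACT_ONLY)
with_join_ok = run(WITH_JOIN)

# Case 2: add a fact row whose MATNR has no match in MATERIAL_TABLE.
conn.execute("INSERT INTO FACT_TABLE VALUES ('M9', 100)")
fact_only_orphan = run(FACT_ONLY)
with_join_orphan = run(WITH_JOIN)

print("integrity holds:", fact_only_ok == with_join_ok)
print("with orphan row:", fact_only_orphan == with_join_orphan)
```

As long as every MATNR on the fact has a match on the material side, both shapes return the same totals, which is what makes skipping the join safe. Once an unmatched fact row exists, the inner join drops it while the fact-only query keeps it, so the result could differ depending on which fields are selected. This is the usual referential-integrity assumption behind the referential join.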
So I am imagining scenarios where a user selects 5 natural keys (with no other attributes) and performs a SUM operation on a fact. In this case it would be possible to eliminate ALL joins and keep all operations within the fact table itself.
Any thoughts?
Thanks,
Justin