Monday, October 18, 2010

Fear the Panther

So, after much waiting and absolutely no pre-event hoopla, Panther is finally here. You will see much in the press and in IBM's official announcements about the features of Panther (Informix Dynamic Server version 11.70) that IBM thinks are important to the marketplace. Jerry Keesee is calling Panther "The Last of the Big Cats", hinting that IBM will be finding another theme for the next release.

I don't want to repeat all of what you will hear from IBM, but I do want to provide my own announcement of the features that I feel will be the most important in the marketplace over the next 12 months. This is a tremendous release as far as features are concerned. Without further ado, here's my list, categorized as IBM has done theirs:

OLTP Data Security

What do I mean by OLTP Data Security?  Well, what is the greatest fear in an OLTP environment?  To me it is loss of transactional data.  If a user has completed a transaction and thinks it is secured in your database, you cannot afford to lose it.  If that transaction was a $20 ATM withdrawal, your company is on the hook for the money.  If it was a multi-million dollar electronic funds transfer, even more so.

If a user is in the midst of a transaction and your primary server crashes, do you tell them, "Sorry. System problem. Please try again later."? You will lose a customer, or several hundred customers.

"OK.", you ask, "How can Panther help?"  Panther has a new feature that IBM is calling "transaction survival".  I call it "uninterruptible transactions".  If your primary MACH11 server goes down any transactions begun by a user connected to any secondary server (HDR, SDS, or RSS) will not be lost but will be picked up by the surviving SDS secondary server which is promoted to be the new primary server and the transaction will continue unharmed.  There is no application change or ONCONFIG parameter needed to enable this feature and make it work.  Your existing applications don't have to be modified to reconnect to the new primary server.  This is all automatic and effort-free.  No other RDBMS server in the industry has this feature.  Oracle does have a similar feature but it's not real and it's not automatic.  You have to enable it in your application because it is actually implemented in the front-end libraries which fake survival by recording the transaction locally on the client in a flat file and replay the transaction from scratch after reconnecting to the new server.  The feature is so lame that Oracle doesn't even talk about it.

To me transaction survival is a biggie!  The biggest feature in Panther.

OLTP Performance

First, ADTC's testing shows that Panther is about 11% faster than 11.50.xC7 without taking advantage of any of the performance-enhancing features mentioned below. However, Panther has several features to further improve the performance of OLTP systems.

One is the removal of the requirement for a foreign key constraint to have an index on the foreign key. The supporting index can now be optionally disabled at constraint creation time (or later) and its storage will be released. The selectivity of these indexes on foreign keys pointing to lookup tables with few rows is low, and the utility of requiring the index has always been questionable. The engine does not need these indexes to enforce the constraints unless cascading deletes are enabled. You likely don't need them for searching either, because most systems have composite keys containing those foreign key columns that are used for searching and filtering. It is rare in an OLTP-style query for the lookup tables to be selected as the primary query tables, requiring an index on the dependent table to support the join. Removing this requirement will reduce the overhead of inserting and modifying data in tables with many code columns without having any impact on query performance.
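
To make that concrete, here is a minimal sketch of what creating such a constraint could look like (the table, column, and constraint names are invented for illustration, and the INDEX DISABLED constraint-mode keywords should be checked against the 11.70 documentation):

    -- Hypothetical schema: order_items.status_code references a small lookup table.
    -- The INDEX DISABLED keywords ask the engine to enforce the constraint without
    -- maintaining a B+tree index on the foreign key column.
    ALTER TABLE order_items ADD CONSTRAINT
        (FOREIGN KEY (status_code) REFERENCES status_codes (status_code)
         CONSTRAINT fk_items_status INDEX DISABLED);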

Another one is the new Forest of Trees indexes. This is a new indexing method that combines many of the advantages of a hash index with those of a traditional Informix B+tree index. For indexes whose first column(s) have low selectivity but are still important to the correct processing of many OLTP queries (think of the geo, country, and region columns of an index supporting a company's geographic reporting structure), the B+tree indexes can become rather deep with relatively few unique elements on a level. What Forest of Trees indexes do is let the database architect specify one or more of the leading columns to be used to create a hash key. Then a separate B+tree is created for each hash value, containing only the remaining key parts following the hash columns. This results in several flatter B+tree indexes. You can't perform range scans on the hash columns (for that you will also need a pure B+tree index), but you can do so on the remaining columns in the index key.
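
As a rough illustration, a Forest of Trees index is created through a HASH ON clause on CREATE INDEX, something like the following (the table, columns, and bucket count are invented; check the exact syntax in the 11.70 documentation):

    -- Hash on the low-selectivity leading columns; a separate, flatter B+tree
    -- is kept for each hash bucket over the remaining key parts.
    CREATE INDEX ix_sales_geo
        ON sales_detail (geo, country, region, cust_id, order_date)
        HASH ON (geo, country) WITH 100 BUCKETS;

Equality filters on geo and country pick the right bucket; range filters still work on the trailing key parts, but not on the hashed columns, as noted above.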

Multiple index scans in the optimizer are another big win for OLTP. Often you have a complex join between three or four tables, using the filtered rows from the independent tables as filters on the dependent table, plus several filters on multiple columns in the dependent table. The optimizer in earlier releases of IDS had to select the single "best index", often resulting in the engine having to read many rows of data and apply the final filters against the actual row data even though indexes on the filter columns were available. The old XPS optimizer could use more than one index on each table in a query. Now Panther can as well. This can reduce the number of long compound indexes, which can also improve insert, delete, and update performance. For databases that are delivered as part of a third-party application where you can't control the schema, this one may be a HUGE win, since the composite indexes that could make such queries reasonably efficient may never have existed in the first place.
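
As an illustration, using the familiar stores_demo tables, a query like the one below previously had to settle for a single index on orders; with multi-index scans the optimizer can consider combining separate single-column indexes (assumed here to exist on orders.backlog and orders.ship_weight):

    SET EXPLAIN ON;   -- the query plan will show whether the index scans were combined

    SELECT o.order_num, c.lname, c.company
      FROM orders o, customer c
     WHERE o.customer_num = c.customer_num
       AND o.backlog      = 'y'
       AND o.ship_weight  > 50;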

Data Warehouse Performance

New star schema support in the optimizer. What does this one mean? This is another offshoot of XPS and the multi-index support above. In Panther, when you have many dimension tables, you can create a single compound index on all of the dimension table keys in the fact table. The optimizer will recognize this and use the filters on the dimension tables to generate a list of valid combinations of the dimension tables' keys in a temp table, then join that temp table to the fact table using the compound index to quickly locate the rows that satisfy the criteria on the dimension tables. Whoosh! At ADTC we've tested this one quite a bit and the results are impressive.
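
Here is a sketch of the kind of schema and query this targets (all table and column names are hypothetical):

    -- Fact table with one compound index covering all of the dimension keys.
    CREATE INDEX ix_fact_dims ON sales_fact (cust_key, prod_key, time_key);

    -- A typical star-join query: the filters land on the dimension tables, and the
    -- qualifying key combinations are pushed into the fact table via the compound index.
    SELECT SUM(f.amount)
      FROM sales_fact f, customer_dim c, product_dim p, time_dim t
     WHERE f.cust_key = c.cust_key
       AND f.prod_key = p.prod_key
       AND f.time_key = t.time_key
       AND c.region   = 'EMEA'
       AND p.category = 'Widgets'
       AND t.cal_year = 2010;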

Administration

I put the new Heterogeneous Server Grid feature under this heading because I can't think of a better one. IBM is actually making quite a bit of noise about this one, and rightfully so. If you have lots of old hardware around, don't throw it out. Don't sell it to a junker or to a used equipment reseller. Configure it as an IDS grid node and add to your company server farm's net computing power at near zero cost (since the boxes are already paid for and amortized). I'm waiting for confirmation about whether this means you can have an HDR or other MACH11 secondary that's built from different hardware or a different OS. I suspect not, but who knows what the future may bring. Heterogeneous grid was a big request from a number of very large Informix customers, so if heterogeneous MACH11 members are what you need, speak up.

Zero downtime rolling upgrades. Speaks for itself. Jarrod down in New Zealand is jumping for joy over this one, I'm sure. He and his one assistant have to upgrade a very large number of servers during the few weeks a year that his company's servers can afford to be offline even for a few hours. Being able to configure their server grid to roll in a fixpack that removes a critical bug at any time will be a godsend for him and many of us out there.

The IIUG will soon be posting its new Features Survey which we will then hand to IBM to help them design the next release of Informix Dynamic Server.  If you don't think that you have a voice in how IBM determines features for Informix, think again.  Every one of the features mentioned here was requested by users like you/me/us either directly or through an expressed need that had to be filled.  Several are features that were asked for and received high marks on the previous Features Surveys.  So, when you see the Insider or email announcement of the Survey, fill it out. 


In the next release I'd like to see:

  • Transaction survival when the server your session is connected to crashes!
  • Bitmap indexes with multiple bitmaps used to generate a net bitmap of satisfying rows.

What would you like to see?  Comments welcome.

13 comments:

  1. Transactional survival would be nice, but I don't think it will happen. You'd have to make something like client-side logging, which is inefficient and very complex. And then the "continue" could be very slow and error prone.

    Anyway, I would love to have simple HDR upgrade without re-initialization (I can't see the difficulty in doing it), maybe workload capture and apply, heterogeneous HDR (I don't think Madison wants this), DDL force (kick out other sessions), locks per session (and other things per session)...
    This would be a good topic for the mailing list...

  2. I agree with the simple HDR upgrade without re-initialization. The last discussion about this was that maybe CONVERSION_GUARD could capture enough information to be applied to the secondary servers without having to do a restore. Conversion guard captures the page changes, which is the same thing the secondary would need for the upgrade. The secondary could be shut down, the new version installed, and the conversion guard files copied to the secondary, with some type of flag or option for oninit to read the files when it starts the new version. You would still have to bring down the HDR/RSS servers, but you would not have to transfer an entire backup. The problem we have is that our RSS servers are on the other side of the earth. It takes two weeks for customs and shipping to get the backup tape to the data center, and 5TB of data across the WAN would take too long and we would have to pay for the additional bandwidth.

  3. Way to promote the features survey, Art!! Oh, and Jarrod is how you spell our kiwi friend's name. Great write-up.

  4. @kernoal: I'm not sure how CONVERSION_GUARD would help. It saves the BEFORE images of the pages. I think (although I may be wrong) that every activity done during an in-place upgrade is logged. IF(!) this is true, I think it would be terribly easy (at least compared to some of the Panther features) to make the secondary servers follow it: oninit would need to check that it's running a different version (as it already does), then, if it's a secondary, it would check the logs, and it would need a "START conversion TO version X" entry to start with. Even a customer with "nearby" secondaries feels your problem.
    An in-place upgrade can typically be done in minutes, and most businesses can "hang" for a few minutes once in a while. It's much harder to explain a longer downtime of the secondaries when they're used for DSS, for example.

    P.S.: I'd like to see this topic in the mailing list...

  5. Art, rolling upgrades sound wonderful, especially with our complex environment. I'll be on the lookout for any information I can get my hands on at IOD about it. See you next week, Art.

  6. Fernando, transaction survival works now in Panther for transactions started on a secondary if the primary goes down. I want it to also work when the server you are connected to goes down but the primary stays up, so that when the client reconnects to another secondary or to the primary it can pick up the transaction in progress and continue. Since the primary is actually processing the transaction, most of what's needed is already there. Reconnecting and picking up the transaction shouldn't be hard; there just needs to be a transaction handle that the front-end library gets from the BEGIN WORK that it can pass to the server it reconnects to. The problem isn't the transactional data, but the context for the transaction: data from queries and open cursors. That will be harder.

    The last piece will be surviving when you are connected to the primary and it fails mid-transaction. You'd probably need to have only an SDS secondary take over, but...

  7. Hi,

    But I'm still waiting for materialized views like Oracle has.

  8. Jakub,

    Honestly, I've never seen the long-term value in materialized views. For OLTP systems they don't make sense because the data in the view is always lagging the actual data in the underlying table(s). For Data Warehouse and DSS systems they are mostly used for summary tables. You can get identical functionality using a real table and triggers on the tables that its data depends upon, keeping the data completely up-to-date.

    On the Wikipedia page that some Oracle wonk wrote about materialized views, they are also used to make a local copy of remote data. To me, that's what triggers and Enterprise Replication are for, and they do the job more efficiently. The article also notes that you can create an index on a materialized view but not on a transient view. Here I'm suggesting using a real table, so you can also use indexes if you do it my way.

    That said, I'm not trying to be argumentative, and I realize I may be missing something about MVs that makes them superior to just using a transient view or a permanent table maintained by triggers or ER. So, please, feel free to enlighten me. I won't let you off easily if your arguments aren't cogent, but I'm open.

    Certainly MVs would help with porting Oracle apps and schemas to Informix, something I think we're going to be seeing more and more of in the aftermath of Chase Bank's online banking site meltdown (powered by Oracle, I hear) last month.

    So, what is really missing?

  9. Hi, thanks for the answer. I'm new to Informix (I've been working with this DB since 11.5). Where can I find a description of how to build the Informix equivalent of a materialized view with incremental refreshing? If I understand correctly, a full refresh could be done with an SPL procedure run as a database task - am I right?

  10. No. I'm saying you can emulate an MV using real tables and triggers. Each of the constituent tables would have an INSERT trigger, an UPDATE trigger, and a DELETE trigger that would maintain the rows in the table that instantiates the VIEW (let's call it a view-table for simplicity). The triggers' actions could also be to execute a stored procedure or UDR to accomplish the sync. Since Oracle supports triggers, I won't go into details on this.
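
    That said, the skeleton of the pattern looks something like this (all table, column, and procedure names are made up for illustration; the UPDATE and DELETE triggers would follow the same shape):

        -- "View-table" standing in for the materialized view:
        -- a running sales total per customer.
        CREATE TABLE cust_sales_summary (
            customer_num INTEGER PRIMARY KEY,
            total_amount DECIMAL(14,2)
        );

        -- SPL procedure that folds one new sale into the summary row.
        CREATE PROCEDURE upd_cust_summary (p_cust INTEGER, p_amt DECIMAL(14,2))
            UPDATE cust_sales_summary
               SET total_amount = total_amount + p_amt
             WHERE customer_num = p_cust;
            -- If no summary row existed yet, create one.
            IF DBINFO('sqlca.sqlerrd2') = 0 THEN
                INSERT INTO cust_sales_summary VALUES (p_cust, p_amt);
            END IF;
        END PROCEDURE;

        -- INSERT trigger on the base table keeps the view-table current.
        CREATE TRIGGER trg_sales_ins INSERT ON sales
            REFERENCING NEW AS n
            FOR EACH ROW (
                EXECUTE PROCEDURE upd_cust_summary(n.customer_num, n.amount)
            );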

    Another option would be to use Informix's Change Data Capture (CDC) feature - I didn't think of this one before - to maintain the view-table. CDC is an API that you can use to hook an application into the server's logging mechanisms so that the engine will send all data modification notices to the application. That CDC application could then maintain the view-table in near real-time, just like Oracle does. You would write it as a daemon that's always running and awakens periodically to process the CDC messages it has received.

    Hmm, maybe I have the makings of a commercial product here... $$$ ;)

  11. Fernando, Kernoal:

    Are you aware that 11.70 includes the ability to upgrade the Informix version for any MACH11 cluster (including a simple HDR pair) with no downtime? The only glitch is that users will be disconnected from the server(s) a couple of times and will have to be reconnected, either manually or through their interaction with the Connection Manager, as the servers in the cluster are each shut down and restarted with the new version. So, while individual sessions may experience brief interruptions, the server's data is always available as the upgrade is rolled from server to server in the cluster. What happens under the hood is very interesting and nearly magical, but it happens. I don't think that Madison wants the details posted here, so I'll leave it at that.

  12. Art,

    Yes, I did go over this in detail at IOD with Madison. However, this is still not exactly what I would like to see. The method that was demonstrated at IOD converts HDR/RSS into ER. Don't get me wrong, this is a GREAT feature. It does allow for a rolling upgrade, but you are left with ER. I want to continue with HDR/RSS for just DR. If I converted my HDR and 2 RSS servers into an ER topology, I would then have to pay for full licenses for all 4 servers.

    I am not asking for an online or rolling upgrade with HDR/RSS. What I would like is to be able to upgrade an Informix HDR/RSS cluster without having to perform the backup/restore to get replication restarted. An outage is acceptable. There needs to be some way to upgrade the primary and maybe produce some type of change file to apply to the secondary server, so it could eliminate the backup/restore needed to start replication over.

    As it is now, to upgrade an HDR/RSS cluster you have to abandon replication, upgrade, and start over from the beginning with replication.

  13. I received word today that the feature request we want will be targeted for the next major release (11.50.xC8 and the next 11.70).

    Here is the description for the idsdb#: "Enable a method to upgrade HDR/RSS to a new version without having to do backup/restore to restart replication"

    Sounds good to me!
