ASE Performance and Tuning Question and Answer Log
When creating an index on a temp table within a single stored procedure, the optimizer actually will consider such an index for use. It appears that the proc is basically being 'recompiled' at run time. Is this correct? (i.e. Basically the same 'cost' is added to such a proc as one that was created or executed 'WITH RECOMPILE'.
Answer: What's happening is that the 'create index' statement is indicating a change to the 'schema' of the table. The index will be create and immediately following the procedure will be recompile. Processing of the procedure will pick up again just following the 'create index', thus the index and it's statistics are known to the optimizer.
Why "enable sort-merge join and JTC" is combined in one configuration parameter?
Answer: JTC was linked to sort-merge join because there was concern that the extra plans that are costed could lead to longer optimization times. However, you can use trace 334 to enable JTC alone. This can be used at the session level or you can boot with it. I strongly suggest you test it out well before implementing it though.
What is the difference between "update statistics" and "update index statistics?"
Answer: The 'update statistics table_name [index_name]' syntax will update/create statistics on the leading column of all indexes on the table or index you specify. You can see which column is the leading column by running sp_helpindex or sp_help on the table. The 'update index statistics table_name [index_name]' will update/create statistics on all columns that are included in an index. This will be done for all indexes on the table or for the specified index.
As a dba, which aspect of the db do you usually check for performance and for diagnostic purposes?
Answer: The field of Performance and Tuning is very large. It covers many areas from query optimization to hardware tuning. In many cases you can find and resolve performance issues at the query level, this is a good place to start. If you find all is well at this level move up.
What part of a stored proc's query plan are figured out during run time? Also, are stored procs sometimes recompiled during runtime to refresh their query plans?
Eric, is there a status on the release of SDTC? I don't think this is the current name anymore, but I'm referring to the Sybase Monitoring Tool that was suppose to be GA with ASE 12.5 Thanks, Joe
Answer: I spoke to Product Management and was told that there will be a public announcement soon, but was not given a date. Stay tuned.
What is data skew's role play in the optimizer?
Answer: The term "data skew" refers to a column that has a small number of values occupying many rows while many value occupy few rows. This results in 'spikes' in the columns histogram which are represented as Frequency Count cells (they represent only one value and are highly accurate because their weight is the exact percentage of the column occupied by that value). Another effect of "data skew" on the statistics has to do with the Total Density value. This value is used when costing a join of the column. The averaging technique used to derive the Total Density value is disproportionally effect by highly duplicated value. This can, in some cases, result in the optimizer estimating that more rows will qualify for the join than actually will. If this is the case it is advisable to change the Total Density value using optdiag or sp_modifystats. Keep in mind that all joins of the column will use the Total Density, so test before implementing.
If 60% of the data change daily, how often should you modify the statistics?
Answer: I assume you mean 'how often should you update statistics'. In a case like this there's most likely no reason to modify statistics. The answer to 'how often to run update statistics' is really "it all depends". It depends on the distribution of values in your dataset and on your queries. If a 60% daily change is not having an adverse effect on the performance of your queries, then there's no need to change the frequency of your update statistics. On the other hand if such a change to the date is having an adverse effect you should consider running update statistics more often.
Why would you run update stats on a single column as opposed to the entire table?
Answer: I'm assuming you're referring to running update statistics table_name (column_name) as opposed to running update statistics table_name. Update statistics table_name will update statistics on the leading column of each index on the table in this case. Where as update statistics (column_name) will update or create statistics only on the specified column whether or not it's part of an index.
How do you clear stats if update all stats was run?
Answer: Run delete statistics table_name (column_name) for all columns you want to remove statistics from.
Can running update all stats hurt?? - And - If we have the maintenance window, is there any harm in running update stats all?
Answer: Depending on the number of columns and rows, it has the potential to add a great deal of time to your maintenance of the statistics. Also, you may not need statistics on every column of the table.
I'm the not admin on my database. Are there any sp's to show me if stats are on lead columns and/or on all composite columns? How about on columns not in an index?
Answer: Normally you'd want to use optdiag to do this. But, you need to be SA to use it. See if you can get permissions. The alternative is to get hold of sp__optdiag. It is not a Sybase supported pro, bit it's very useful to view that statistics without going to the command line to use optdiag. You can find it at http://www.sypron.nl/download.html
I've noticed on large tables that are very volatile (many inserts/updates/deletes) update statistics stops working at some point and we have to rebuild all the indexes. Is this common and why does it happen?
Answer: No, this is not common. It sounds like a bug. I suggest you call Tech Support with this issue.
How does dropping and re-building of indexes compare to update index statistics
Answer: First off, when you drop an index in 11.9.2 or above the statistics remain on the column since they no longer belong to an index. When you create an index the statistics for the leading column are updated or created. Basically, update statistics on the leading column is run.
How frequently should you update statistics? I think you said it was a one time deal, but I just want to check.
Answer: How often to run update statistics is dependant on your date, how it changes and the queries. There is no 'rule of thumb' that can be applied to all situations. You need to test over time to determine what the most optimal schedule is for running update statistics in your environment. The ONLY case in which you need to run update statistics only once is when you have a static table who's data does not change.
Pat Shaughnessy Asked: Are statistics static or gathered and stored real-time?
Answer: The column level statistics, those that measure the distribution of values in the column, are static. They need to either be updated or written in order to change. Some of the table/index level statistics are maintained dynamically in memory and are changed as your data changes. These include the row (index and data row) counts, the page (indexpage and datapage) counts, the deleted row count, the forwarded row count (DOL tables only), and the cluster ratios. Other table/index statistics such as empty pages, data row size and index height are static and are updated when you run update statistics.
Gisha Babby Asked: How do you determine argument selectivity?
Answer: It's usually difficult to estimate the selectivity of a search argument on your own. However, you can easily tell how selective and column/index are by examining optdiag output. Look for steps (cells) with relatively high weights. This is will indicate highly duplicated values. The more highly duplicate values the less selective the column, and thus any index it's in, will be.
Helen Cao Asked: does this mean that we should add statistics to all columns? how often do i need to update the statistice on all columns?
Answer: No, it is rarely necessary to have statistics on all columns of a table. However, it is advisable to place statistics on all columns of your composite indexes (those indexes with more than one column in them). Of course, test before implementing in production.
Cal Ranjan Asked: What are the trace flags to see the execution plan of a query/ stored procs/force indexes ...
Answer: The traces used to view the optimizer's costing logic and the query plans that were considered are traceon 302 and 310. There's also showpaln See the P&T Guide for specifics on how to use and read them.
Joe Janhonen Asked: How do histograms represent composite indexes?
Answer: Histograms do not represent composite as a single object. Histograms represent the distribution of values in a single column. If all columns of a composite index have statistics on them the optimizer will have a much clearer picture of the selectivity of the index as a whole. The multi-column density value is gathered for all sequential sets of columns in an index. This value is used to further measure the selectivity of a composite index.
Peter Schonberger Asked: are there other than -T1 flags for optdiag that we should know about?
Answer: Yes, there are a few other traces that can be used with optdiag. Check the P&T Guide for specifics.
Sharon Shields Asked: If I update statistics on all composite index columns once a day on a very active table, how useful will this be?
Answer: It is possible that updating statistics daily will be helpful. But, you will have to test it to determine how often to update statistics in your environment.
Matt Rogish Asked: On slide 32, we saw the 40,000 value taking up a large amount of space which messed up the stats. Are there plans in the next version to 'smooth out' or remove the abnormally high/low values so we don't have to modify stats manually?
Answer: The value 40,000 in the example was represented by a Frequency Count cell in the histogram. Its weight was considerably higher than the other cells. This is a new and very accurate way to represent highly duplicated values in the histogram. When the optimizer finds that it's using Frequency Count cells it knows it doesn't have to do more estimates. However, the highly duplicated value does have an effect on the Total Density value which is used in estimating the cost of joining the column. This is being addressed in a new version of ASE.
Ivica Curak Asked: I have a join, including 16 tables( 22.214.171.124). It takes 22 sec to optimize ( I guess time is spend on costing) to execute the very first time. The next execution is fast ( 16 ms). Why does it take so long to optimize it?
Answer: It sounds like this is a stored procedure. In the first run all possible plans will have to be costed with the estimated cheapest one chosen to be executed. On the next execution of the proc the plan is reused and does not go through optimization. The more tables in a join the more plans that need to be costed. The number of possible join orders alone for a 16 table join is very very high (N!) and that doesn't include index aceesses.
Steve Caffey Asked: Out of bounds costing - both your solutions require that you know an out of bound value or have a window where the stats can be fixed. What if data is quickly inserted/deleted many times within a process and we occasionally have the out of bounds problem? Also - do the stats have to have a lower/upper boundary - can it just have a less than or greater than clause?
Answer: In a situation like you describe I would suggest you use a 'dummy' row in the data itself that his high or low enough (depending on whether it falls outside the low end or high end of the histogram). This will eliminate out of bounds costing. The histogram must have an upper and lower boundary in order for it to describe a set of values. However, your idea of using a less than and greater than at each end of the histogram is ineteresting.
Rodney Stromlund Asked: I'm not admin on my database. Are the sp's to show me if stats are on lead columns and/or on all composite columns? How about on columns not in an index?
Answer: Use optdiag to see which columns have statistics and which are members of an index.
Anne Mahoney Asked: is there any concept of dropping all statistics to force 'rule-based' optimizer behavior?
Answer: ASE's optimizer is primarily cost base. However, there are rules based aspect of it. If you remove all the statistics the optimizer will be forced to use the 'default selectivity values' as the selectivity of the search argument or join. But, it will not use classic rules based optimization. Most of the major RDBMS vendors are moving to cost based and away from rules based op
Kevin James Asked: Does the physical layout of a table (fragmented pages, devices, etc.) have an effect on update statistics values?
Answer: No, not on the running of update statistics. However, table and index fragmentation will be reflected in the cluster ratios, and 'Space utilization' values.
tom elder Asked: How abut running optdiag on a replicated DB and using its statistics for the source DB?
Answer: This will work - replicate the data, run update statistics on it, get an optdiag file, read the file into production (make sure server name is changed in the file). Keep in mind that this will only write the column level statistics. From time to time you'll need to run update statistics to update the table/index statistics.
Suzan Bennett Asked: Create Indexes with 0 values after upgrade to keep original stats, yes/no? How?
Answer: Using the create index with 0 values syntax will create the index but not gather or write statistics. You should only use this syntax if you don't want the statistics to be overwritten, usually in cases where you've modified them. After you upgrade from a pre-11.9.2 version run update statistics as soon as you can. This will put the statistics into the new format.
Mark Anderson Asked: If we lose customized step count by deleting stats, won't they be put back the next time we update stats?
Answer: If you delete statistics and then update statistics on the column the number of steps used will be the default of 20. If you need to delete statistics I advise that you check the number of 'Requested steps' in the optdiag output and use that number when you next update the statistics for that column.
Peter Schonberger Asked: if ALL stats have the default 20 steps, then is it true that delete stats has no adverse effect?
Answer: Yes, when you recreate statistics 20 steps will be used.
Angavai Krishnamoorthi Asked: How can I delete statistics?
Answer: Delete statistics table_name [(column_name_list)]
Barb Walrond Asked: after upgrading from 126.96.36.199 to 12.5 we had to recreate all the indexes to get the queries to run the same - why would update stats not work?
Answer: This sounds like a bug, call Tech Support and see if they know if it's been fixed.
Gerald Justice Asked: When will there be composite statistics on composite indeces? Wouldn't these help?
Answer: There are the multi-column density values that are stored for all subsets of columns in a composite index. These are used in estimating the cost of the composite index when multiple columns are referenced in the query. However, multi-column histograms are not stored. At present there are not plans to add them.
Anand Jayaprasad Asked: where does one get a detailed information on using Optdiag ?
Answer: The Performance and Tuning Guide has a good section on optdiag. Also, if you go to www.sybase.com then to Support and then to Technical Documents you can search on 'optdiag' for white papers and other resources.
Eric Larson Asked: How can we be more proactive in gathering statistics? Is there anything in optdiag that would indicate that a table is in need of statistics to be gathered?
Answer: No, optdiag will not tell you that statistics need to be updated. Knowing what the optimal interval for running update statistics is usually done by trial and error. However, if you have a commonly used atomic columns, such as a date or identity column, you can check the max value in the histogram using optdiag to see how close it is to the actual max value in the column.
Joan Linker Asked: Is it necessary to run update statistics after you upgrade to 12 from 11.9.2?
Answer: No, it's only necessary if you upgrade from pre-11.9.2
KUMAR BALARAJU Asked: Is it not good to have recent statistics. Why do we ever need old statistics. Is not optimizer better do its job with latest statistics than old.
Answer: In the vast majority of cases I've seen involving dynamic tables having recent statistics results in efficient query plans. In rarer cases I've seen older sets of statistics, or removing statistics all together result in very good plans. You'll need to test you data and queries. If running with recent statistics works for you, I doubt you'll need old statistics.
CLAUDIA FERNANDEZ Asked: If your stats are up-to-date, you have fine-tuned ASE configuration, how to make sure that the SQL statements are using a good query plan?
Answer: Use showplan along with traceon 302 and 310 to diagnose the query plan. See the Performance and Tuning Guide for details on how to use these tools.
Cal Ranjan Asked: Does Sybase Optimizer understand/recognize Star Schema?
Answer: ASE will process Star Joins, but it doesn't specifically look for them.
Susan Howard Asked: Should you review the cluster ratio to determine when to recluster a table? Is there a tool to make reclustering very large tables less time consuming?
Answer: Yes, the cluster ratios are one set of values in optdiag you can use to monitor fragmentation of the table or index. Others are - empty pages, deleted rows, forwarded rows (on DOL tables only). In ASE 12.0 and above the new values - space utilization and large I/O efficiency can also be used.
Curt Nothwehr Asked: Have you ever seen optdiag index height -1, and if so, what does it mean and what causes it? We've had problems with invalid query plans for tables accessed with this condition. And - Why do we get negative values in systabstats ? does this affect performance ?
Answer: There have been a couple of bugs dealing with negative numbers in the statistics. Call Tech Support and ask them in what EBF they were fixed.
Bob Doss Changes Question To: Why is't partition statistics updated when update statistics are run?
Answer: Historically update partition statistics has been separate from update statistics (until the advent of update all statistics). If you don't have partitioning it will add a small bit of unnecessary work. On the other hand it could be useful. Call Tech Support or go to the ISUG website (isug.com) and log a feature request.
Pranav Deshmukh Asked: How does statistics work when a table is partitioned and it has row level locking on?
Answer: The only difference in the statistics for a DOL (row level locked) table is that the number of forwarded rows is taken into account, there will be two I/Os for each. Information on the size of the largest partition and partition skew are taken into account by the optimizer. This information is used to cost queries running in parallel.
CLAUDIA FERNANDEZ Changes Question To: Can I dump/load stats from a different ASE (ie. production) into another database (ie. development) to optimize SQL and simulate production?
Answer: Yes. In fact this is the best method to use because the dump is a snapshot of the production DB. Lodaing the dump, for example, will not remove any fragmentation that exists in production.
Alberto Asked: Is there any good tool right now on the market which can help to tune SQL?
Answer: Yes, Sybase's SQL Expert is a very good tool for tuning your SQL. Check www.syabse.com for more information.
Yasmeen Ahmed Asked: Is there a difference in how DOL tables and Page lock table update statistic works
Answer: On a DOL table update statistics can run in isolation level 0, thus using dirty reads and making less of an impact on users.
martin skye Asked: What is the bug problems with update all statistics.
Answer: There are no bugs with update all statistics that I'm aware of. The only issue with it is the time it can take to run.
Eduard Mostert Asked: In how many cases, based on your experience is update index statistics preferred above regular stats
Answer: In most cases where there are composite indexes I've seen more accurate cost estimates from the optimzier when there are statistics on all columns of the index.
Judy Scheinuk Asked: If I have added statistics to the interior columns of an index and I then run update statistics, what happens to those interior column statistics? Do they get updated, go away, or remain unchanged?
Answer: Update statistics table_name only effects the statistics of the leading column of indexes. The statistics on inner columns of your index will remain unchanged unless you run update index table_name or update statistics table_name (column_name)
Raymond Jones Asked: Is there an sp that gives a simple summary of what columns in a database have statistics and when they have been updated? (Palmeel@dshs.WA.gov)
Answer: Optdiag will give you this information.
Peter Wu Asked: Could Sybase boost up the performance of update index stat using the paralleling processing ?
Answer: Yes, running update statistics in parallel can reduce the time it takes to run.
Susie Richardson Asked: When you create an index on a new, populated table, do you get statistics on all columns of the index or just on the leading column?
Answer: Create index will only write statistics for the leading column of the index.
Joshua Doherty Asked: I was wondering if you might be able to explain a strange behavior I witnessed while joining three tables together in one of my queries. If I joined table A to B, and B to C, the query took as long as 10 minutes to run. If I then added code that joined table C back to table A, the query ran in only a matter of seconds. I was under the assumption that the order of the joins didn't make a difference, but apparently in this case it did. The output of each query was the exact same, so I'm not sure what might have happened. Any ideas?
laksh sridhar Asked: What are the pros and cons of putting indexes vs adding to stats for col in a join?(email@example.com)
Debbie Herold Asked: Can you explain the benefits of the sort-merge join functionality added in 12.0?
Answer: SMJ can be very efficient compared to nested loop join in some situations. SMJ uses ordered index columns to avoid sorts and to more easily access the qualifying rows of the join. However, it is designed for DDS type queries and DBs. You need to make sure your indexes are designed to take advantage of SMJ.
Dennis Chan Asked: What is tempdb used for and how do I calculate the size of tempdb to be used?
Answer: When running update statistics on any column other then the leading column of an index a worktable will be created to sort the column values. This worktable is created, sorted and read in tempdb. I would suggest you take the maximum size of a value in the column and multiply that by the number of rows in the table. This will give you a good estimate of the space needed for the worktable.
Can stats be created on tempDB tables and how
Answer: Yes, if the table is a regular table in tempdb you can do what you'd do with any table. If it's a # table you can run create and index in a proc that creates the # table. This will recompile the table and the statistics will be seen by the optimzier. However, this won't work for a batch query.
Is there a specific cluster ratio column in the sysstatistics table?
Answer: The 'CR count' values that are used to calculate the three cluster ratios are each stored in their own column in systabstats - dpagecrcnt for the Data Page CR count, ipagecrcnt for the Index Page CR Count and drowcrcnt for the Data Row CR count.
Where can I get your white papers on how to read the 302 310 output. And - Where can I find a white paper doc on table level lock and row level lock?
Answer: Look in the "Technical Documents" section under Support at www.sybase.com (http://www.sybase.com/support/techdocs)
Max Scott Asked: What happens to column statistics when I truncate a table?
Answer: They remain on the table. This can be useful when you are truncating and populating a table frequently. While the remaining set of statistics may not completely reflect the actual data, it will save the time of running update statistics.
Ruth McNay Asked: Is there any way to fix the optimizer for date column indexes so that they don't have to be run every day. We have a table that has 14 days of data with about 400,000 rows per day. Before when it figured the distribution and didn't have values it wasn't bad but now we end up forcing indexes to get good performance.
Answer: I'm not sure if you're saying that before you place statistics on the column you had better performance. If this is the case then I suggest you delete statistics from the column and run without them. The other approach to try is the either placing a 'dummy' value in the data or in the value of the last step of the histogram.
Dan Sperka Asked: Should join columns be included in composite indices? If so, should they be first, or where in the index
Answer: This is a tuning decision. You'll need to test whether to place a join column in an index, make it the lead column or not index it at all. In most cases making the lead column works well. If you're using Sort Merge Joins then making the lead column is a necessity.
What are the effects of interrupting update stats?
Answer: If update statistics is interrupted the statistics that have been gathered up to the time of the interruption are written. You'll have some statistics that are up to date and some that aren't.
Maria Maskell Asked: Are there any plans to be able to run update statistics that do not block users?
Answer: As of 11.9.2 update statistics doesn't take out an exclusive table lock. Also, if update statistics is run on a DOL table isolation level 0 will be used to further reduce effects on users.
How does the optimizer affect dynamic SQL from a PowerBuilder application?
Answer: Once the dynamic SQL enters the server it is handled the same as any SQL and passes through all phases of query processing, including optimization.
We've noticed that having 16K pools in the data cache causes a performance degradation when the optimizer uses the 16k pool for the qeuiry plan. why is it so ?
Answer: This is usually due to poor clustering of non-clustered indexes. Check optdiag output for the cluster ratios of any non-clustered indexes that are not being used with large I/O.
Brian Asked: I've gotten notes before like with 11.0.3 from Sybase saying that trace flags to revert back to previous optimizer logic. 304 for example.
Answer: Yes, there are several individual trace flags that effect various areas of optimizer functionality, but there no single one to revert back to an earlier version. Some traces remain in ASE from version to version, some are taken out quickly. I advise that you always get a full explanation of what any trace does, why you need it and what if any danger there is in using it.
Richard Asked: I have a large table with an index on a todays-date column. I know the approximate number of rows that get added on a given day. Running update stats takes a considerable amount of time. Is it reasonable to update the last cell of the column stats, or create an new cell at the end of the stats that includes the approximate daily volume? I'm hoping this approach will provide more reasonable stats for the period between our update stats runs.
Answer: Yes, this is a reasonable approach to take between runs of update statistics. Just make sure that any values you add to the histogram are tested.
vijyant Asked: can we bcp out systabstats/sysstatistics and bcp in to a a different server
Answer: The only problem with this approach it that the object ids will have to be the same in both servers.
Joseph DeFilippis Asked: How can you tell which columns have stats on them? and Viren Parikh Asked: Is there a way to know what the histrogram looks like on a table?
Answer: Use optdiag or sp__optdiag (not a Sybase supported proc) to view the histogram the column.
Hanteng Dai Asked: There is another myth: Cursors are bad for performance. Do you agree? It all depends?
Answer: No, I don't agree with this. And, yes it will all depend on a number of factors. You may need to tune cursors just as you would any query.
Peter Wu Asked: How do we decide which columns need statistic ?
Answer: It's recommended that you test your queries with statistics on all columns of your composite indexes. Also, you should test your join queries with statistics on any column that is being joined that does not have statistics. The testing of your queries using various sets of statistics is a part of your performance and tuning efforts.
Kanika Asked: do we need to recreate indexes time to time instead of doing update statics on the tables which have high I/o. Does it makes a difference.
Answer: Create index will gather statistics for the leading column, it will not effect other columns. So, in terms of the column level statistics there is no need to drop and recreate indexes. However, doing so will defragment the table/index and this will effect the table/index level statistics.
Kobi Lifshitz Asked: I am looking for minimalist update stats that would be relatively highly accurate (> 95%) but would be less intrusive by simply sampling some of the data rather than read all (depending on which update stats options is chosen) pertinent data?
Answer: A sample that's >95% would require that much if the table be read. Sampling by its nature does not result in a completely accurate representation of the column. However, sampling can result in statistics that are accurate enough to result in efficient query plans. The webcast was not designed to discuss ASE futures. If you attend this year's Techwave there will be presentations of ASE futures.
Ramon E. Arcilla Jr. Asked: When you upgrade ASE to the next higher version, do you need to rerun the same update statistics on pertinent columns?
Answered: This is the recommended process. Once you're satisfied that things are running well it's also recommended that you then test any new functionality. We do not anticipate changes to the statistics in the near future.
malik ishaq Asked: are index stats automatically run if a table is dropped and recreated along with its index
Answer: If the table is dropped, recreated and repopulated statistics will not exist. If you create an index you'll get statistics on the leading column.
Jerry Glenn Asked: I was under the impression that under 11.9 and beyond, the update stats no longer lock tables or otherwise affect users on the system. Is that not true?
Answer: It will not take out an exclusive table lock. If it's a DOL table update statistics will run in isolation level 0 and not effect users as much as in previous versions.
Helen Lenng Asked: When histogram is old and values out of range for query when does optimzer use selectivity=1? You stated it uses either 0 or 1; then does it use 1 - the worst choice
Answer: If the search value is less than the minumum value in the histogram and the operator of the query is >= and if the search value is greater then the maximum value in the histogram and the operator is <= the selectivity of 1.00 will be used. Otherwise a selectivity of 0.00 is used.
Dave Smits Asked: Is increasing the amount of steps in the histogram a pretty common practice that produces noticable results? It seems like it's use is probably pretty rare.
Answer: It is a fairly common procedure among Customers. In many cases it does have a positive effect on the optimizer's choices. If a dataset was upgraded from a pre-11.9.2 version, and update statistics were run, then the number of requested steps will be the number of steps in the old distribution page. This is usually more than adequate.
malik ishaq Asked: what is the criteria for getting selectivity - is it only total density or other stats are used to
Answer: For search arguments both the histogram the Range Cell density are used. For joins the Total density is used.
Santhosh Asked: How about getting the stats from warmstandby to the primary ?
Answer: This will work. Use optdiag to get an output file from the standby and read it into the primary.
Anjali Bhatnagar Asked: How to change value using optdiag?
Answer: You read an edited optdiag output file into ASE. See the P&T Guide for the 'how to'.
Will having stats on all columns bog the server down? and Does increasing the number of steps for statistics have any negative impact on performance ?
Answer: This is not likely to happen. You may see the parse and compile time increase slightly as more histogram steps are added, but this is usually a matter of a few milliseconds. The more accurate query plans you get from the availability of more statistics is a good trade off.
Max Scott Asked: Are there special considerations for obtaining statistics on columns that have nulls for a large number of the rows in a table?
Answer: No. The NULL values will be read as any real value will be. The one difference is that NULL values are only represented by one step (cell) in the histogram. Step (cell) 1 is where the NULL values will be, if the weight of cell 1 is anything other than 0 then that value is the percentage of NULLs in the column.
Doug Thomas Changes Question To: Is update all statistics good to do an initial migration?
Answer: No, you should only use update all statistics if you know for sure your queries will benefit from statistics on all columns of the table. Update all statistics can take a very long time to run.
What is traceon 302?
Answer: Traceon 302 prints out the optimizer's costing logic during the index selection pahse. See the P&T Guide for information on running it and reading the outputs.
Peter Veilleux Asked: where is the existing step count value stored?
Answer: It's stored in systabstats.
Jasvinder Singh Asked: what will happen if I have 4 indexes and they have same cols in the index on running update index stat on all these indexes.
Answer: Update statistics will only need to read the column once. Since the distribution statistics belong to a column and not an index they are stored only once.
Patrick Sand Asked: can we use this optdiag trick for ASE 12.0?
Answer: You can use the optdiag 'tricks' in all versions from 11.9.2 - 12.5
Dave Smits Asked: Is the histogram useful when 3/4 of the values of the column are the same value?
Answer: Yes it is. In such a case that single value that occupies 75% of the column will appear as a Frequency count cell in the histogram. The other values will be represented by other cells. The Frequency count cell will be very accurate. In some cases you may find that having no statistics and using the default selectivity values works better, you'll have to test to find out.
Mahesh Asked: Why do we need to update statistics? Can't ASE automatically keep track of these stats (I beleive ASA does so?)
Answer: In order to keep track of the column level statistics ASE would have to read all changes to the column and write these as they happen. This is very I/O intensive. ASA will gather column statistics dynamically, but in most cases ASA datasets are much smaller. However, this would be a good feature request for ASE.
Randy Cregar Asked: If you don't put the index name for an update index statistics...is that okay ?
Answer: Yes, it's OK. What will happen is that statistics will be updated/created for all columns of all indexes on the specified table.
Kanika Asked: In which table will the statistics be written, in the column statistic table or the other one
Answer: When you run any form of update statistics bot the table/index and column level statistics will be gather. Thus, statistics will be written to both sysstatistics (column level) and systabstats (table/index level).
malik ishaq Asked: I have had situations where update all statistics worsens performance and index statistics helps performance - how come?
Answer: In some situations having statistics on very dense columns (columns with a very high degree of duplicates) can result in worse query plans than if the default selectivity values are used. I suggest dropping the statistics, as I assume you did, and then just create statistics on the indexed columns. Basically, there are times when not letting the optimizer have a clear view of your data can be beneficial.
Aleksey Asked: How often do the table/index level statistics change? And - Is there a way to automatically reflect changes in dataset into statistics? If not what is the reason to have not this feature?
Answer: Some of the table/index statistics are maintained in memory - row and page counts for the table in indexes, deleted rows, forwarded rows for DOL tables and the CR counts (used to calculate the cluster ratios). As DML occurs these values change. The optimizer will read these values from memory. From time to time they're flushed down to systabstats by - housekeeper, running update statistics or optdiag or sp_flushstats or at a clean shutdown. Currently, all other table/index level and column level statistics are static and need to be updated with update statistics.
Chris Howland Asked: Is this expected: Upper and lower bounds are too close for interpolation to be accurate.
Answer: Under certain circumstances the two SARG values of a range SARG can be too close for the interpolation function to work well (it estimates how much of a step or steps will qualify for a range SARG). In this case other forms of costing will be used by the optimizer.
Rajiv Saggere Asked: When joining multiple million row tables on the indexes why does the optimizer sometimes not to use the indexes even with fresh update statistics?
Answer: I suspect this has to do with highly duplicated values having an adverse effect on the Total density value. I suggest you take a look at the value(s) and change them to a lower number using wp_modifystats or optdiag.
Blair Saunders Asked: Prior to 11.92 wasn't selectivity and density an inverse relationship of each other? If so, why do both need to be stored, because knowing one would imply knowing the other.
Answer: No, this has never been the case. Selectivity is estimated by the optimizer, it's not stored. Density is a measure of the average number of duplicates in the column. The histogram may not show duplication unless it has enough steps (cells). You may be thinking of how ASE used to report join selectivity as opposed to SARG selectivity in pre-11.9.2.
John Wu Asked: if the step count increased from 20 to 200, how much longer will the update statistics going to take??
Answer: The amount of difference will be so low you won't be able to perceive it.
Greg Morse Asked: You said in answer to another question that you do not keep stats on text cols. Most of my indexes are composite with the leading col, a smalldatetime, and the next col a char(10), or a char(4). Would using update index statistics be of any value to me over and above a simple update statistics?
Answer: Yes, it would most likely help. When I answered that question on the phone I was referring to column of datatype "text" not char/varchar. Char/varchar columns can have statistics.
lou fridkis Asked: From: If update index statistics runs out of tempdb space, is there a way to estimate how far it got so we know how much bigger to make tempdb?
Answer: Multiply the size of the datatype by the number of rows and add a small 'fudge factor', say 10%. This will give you a 'ballpark' estimate of the size of the worktable needed.
shu huang Asked: Are column statistics gathered more than one time if the column appears in more than one indexes for update all statistics?
Answer: No, column level statistics no longer belong to an index they belong to the column. They will only be gathered once per run of update statistics no matter how many indexes the column is in.
Ramesh Narayanan Asked: Do we have to estimate how much space is needed to store statistics in the database?
Answer: No, the space to store statistics is minumal.
Bill Auslander Asked: It wasn't clear to me. If you run update index statistics on a table, then is it redundant to run update stats on the table?
Answer: Yes, because update index statistics has already updated the statistics of the leading column of the index.
John Grundy Asked: Is it advisable to run sp_recompile after any update statistics? Can optdiag determine the max steps the server would use for a column, if not, is it advisable to increase steps until you determine this value?
Answer: Running sp_recompile after update statistics is a good idea. Optdiag will show you the actual number of steps used to create the histogram. The number to request via create index or update statistics is a matter of tuning. You'll need to test to see how an increased step count effects your queries it a matter of trial and error.