Query Processing Innovations for Data-Intensive, Modern Applications: Build 2018


>>Thank you for joining. What you have up here is the query processing program manager team. I'm Joe Sack, program manager, and with me are Shreya and Kevin. The three of us work with the query processing engineering team for SQL Server and Azure SQL Database; everything we cover is applicable to both worlds and may expand beyond that as well. In terms of subject matter, we start off with graph: we added graph support in SQL Server 2017 and in Azure SQL Database. How many folks have had a chance to work with graph in SQL Server? Good. For a lot of people this will be new. Then we'll cover adaptive query processing. This is a new feature family in SQL Server 2017, and the essence of it is that performance should improve based on runtime characteristics. Everything you learn about today is a good baseline for what's coming next. Living in a cloud world, we're able to get innovations out more quickly, so even if you're familiar with some of the things we cover today, it's good to have a solid baseline, because we're building on it over the next year, and we'll talk about that at the end. With that, Shreya.

>>All right. I'm a program manager with the SQL Server and Azure SQL Database team, currently focusing on the graph processing features that shipped with SQL Server 2017 and are available in Azure SQL Database. Today I'll give you an introduction to what the features are and a quick demo of what you can do with graph processing in SQL Server.

How many of you know what a graph database is? A graph database is nothing but a collection of nodes and edges. Nodes represent entities and edges represent relationships. You might ask how that differs from a relational database, where tables represent entities and keys represent relationships. The difference is that in a graph database an edge, or relationship, can flexibly connect a node to many other kinds of nodes. For example, if you have a Facebook-like graph with a Person node and a Likes edge, you can use the Likes edge to connect a person to another person, meaning a person likes a person, or you can use the same Likes edge to connect a person to a restaurant or an organization. Edges connect heterogeneous nodes to each other, and you can easily model many-to-many relationships with them, as opposed to foreign key relationships, which let you model one-to-many relationships in a relational database. Graph databases also give you an easy, pattern-based query language for writing traversal or navigation queries.

So why are we doing this inside SQL Server rather than building a separate graph product? We talked to several customers before we built these features, and all of them unanimously said they wanted something on the same platform, so they don't have to perform expensive ETL operations or pay for separate licensing, and can do it all in one place. And we have a mature product in SQL Server, built on 40-plus years of industry and academic research, with mature tooling, a rich ecosystem, and a lot of community support. We are trusted: millions of enterprise customers trust us with their data, they use us for mission-critical applications, and you can build on-premises, cloud, and hybrid applications using SQL Server.
It also shows that SQL Server is capable of graph processing alongside dedicated graph databases, and in some cases even better. With that, let's find out what's new in SQL Server 2017 and Azure SQL Database.

You can now create a graph, which is a collection of nodes and edges. We model nodes and edges in the database as tables, so when you create a node type, say a Person node type, you are actually creating a node table, and when you create an edge type, you create an edge table. Every row in those tables represents a node or an edge in your graph. To support creating and querying the graph, we have introduced some DDL extensions and query language extensions that let you write navigation queries. You get the tooling and ecosystem you've been using with SQL Server, and you use the same T-SQL language and tools for the new features; you don't have to learn a new ecosystem or a new query language. Like I said, this is completely integrated inside the SQL Server engine, so tools like SSMS, backup and restore, and export and import all just work out of the box. You can also join relational and graph tables to each other, so you can keep your relational schema and graph schema in the same database and join them together for whatever reporting you might need. Along with that, all the other cutting-edge technologies that SQL Server provides, like columnstore indexes and machine learning with Python, are supported on graph tables, so you get everything out of the box for graph. We support all the security and compliance features too, so dynamic data masking, row-level security, and the rest work out of the box for your graph schema. All of this is available in SQL Server as well as Azure SQL Database.

So let's see how you create a node, and what the DDL extensions look like. To create a node, you create a table the way you normally would; all indexes and constraints are supported, and you just add AS NODE at the end of the CREATE TABLE statement. That's how you create a node table. You may ask: if this is just a table, why create a different type of table? Why add AS NODE at the end rather than creating a regular table? When you create a node table using AS NODE, then along with the user-defined columns you specified, we create one implicit column called $node_id. This is a unique identifier for every node in your database. It's a computed column, not an identity value; it's stored as a bigint on disk, but it's computed, and if you select from the table you see a JSON string. That string tells you this is a node table, the schema and table name (Customers, for example), and the internal ID for that row. If you export the data using, say, import/export, you get these JSON values out, and you can change the table name in them in case you want to import that data into a different table in a different database. We also create an implicit unique non-clustered index on the $node_id column. If you create a unique index of your own, we don't create the implicit one; but if you did not specify any indexes, we add it, and we use it later for faster lookups during traversal.
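A minimal sketch of the node-table DDL just described; the table and column names are illustrative, not the exact demo schema:

```sql
-- Node table: a regular table plus AS NODE at the end
CREATE TABLE dbo.Customers
(
    CustomerID   INT           NOT NULL PRIMARY KEY,  -- your own key is still allowed
    CustomerName NVARCHAR(100) NOT NULL
) AS NODE;

-- Selecting from it shows the implicit $node_id pseudo-column (a JSON string)
-- alongside the user-defined columns.
SELECT $node_id, CustomerID, CustomerName
FROM dbo.Customers;
```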
[ Inaudible question ]

>>The question is, do you even want to create CustomerID as a primary key for the table, since there is already a unique identifier? Why create your own primary key? Is that the question?

>>No, I get it. I get it. Sorry.

>>The reason is that you can still have your own primary keys on the table; we don't stop users from creating them, and you can link another table to this one using primary key/foreign key relationships. That's possible. We create node IDs for internal purposes. If you want to use identifiers in your predicates, we recommend you use your own keys for that purpose and not $node_id, because it's a computed column and the JSON is computed every time, so performance might suffer.

Let's look at how you create an edge. Again, it's just creating a table. All types of constraints and indexes are supported, including clustered columnstore indexes. An edge table may or may not have user-defined attributes; the edge here has none, and that's allowed. When you create an edge table, we create three implicit columns along with any attributes you have defined: $edge_id, which is the unique identifier for each edge in the database, and $from_id and $to_id, which hold the node IDs of the nodes a given edge connects. Notice that when we created the edge, we never specified the types of nodes it connects. It's only when you insert data into an edge table that you supply the values for $from_id and $to_id, which tell the engine which nodes that edge connects. This is what allows edges in SQL graph to connect any node to any other node in the database. You could basically create one generic edge type in the graph and use it to connect every node to every other node, rather than having typed edges. Yes?

[ Inaudible question ]

>>The question is how I got the values for the customer ID and stock item ID there, right? I'll show you how to insert the data, and that will cover how to get them. This slide just shows that the edge connects customers to stock items, meaning customers bought stock items, so the $from_id is the node ID of the customer and the $to_id is the node ID of the stock item.

Inserting into a node table is very straightforward: you insert just like into a regular table, specifying the table and the attribute names you want to populate. When you insert into an edge table, along with the user-defined attributes on your edge, you have to insert values for $from_id and $to_id, the node IDs of the nodes you want to connect with this edge. In this example we connect a customer to a stock item using the Bought edge, which says the customer bought that item. We go from the customer to the stock item, so the $from_id is the customer's node ID and the $to_id is the stock item's node ID. In this example I just got the node IDs from the other tables in the database, with subqueries where the customer ID on my Customers table matches one value and the stock item ID matches the other. Does that answer your question? All right.

Moving on, let's look at the MATCH predicate we have introduced. MATCH gives you pattern-based navigation and a join-free syntax for traversing the graph, using ASCII-art arrows to express the traversal. This is what a traversal query looks like.
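To make that concrete, here is a hedged sketch of the edge DDL, the inserts, and the simple traversal query described next; the table and column names are assumed, loosely following the demo, and may differ from the exact scripts on GitHub:

```sql
-- Edge table with no user-defined attributes (allowed)
CREATE TABLE dbo.Bought AS EDGE;
-- (assumes a dbo.StockItems node table created with AS NODE, like dbo.Customers above)

-- Insert a node row just like a regular table
INSERT INTO dbo.Customers (CustomerID, CustomerName)
VALUES (1, N'Alice');

-- Insert an edge: supply $from_id and $to_id by looking up the node IDs
INSERT INTO dbo.Bought ($from_id, $to_id)
SELECT c.$node_id, s.$node_id
FROM dbo.Customers AS c, dbo.StockItems AS s
WHERE c.CustomerID = 1
  AND s.StockItemID = 42;

-- The simple traversal query described next: customers who bought a given item
SELECT c.CustomerName
FROM dbo.Customers AS c, dbo.Bought AS b, dbo.StockItems AS s
WHERE MATCH(c-(b)->s)
  AND s.StockItemName = N'White Chocolate Snowballs';
```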
In this query we find all the customers that bought a stock item, the stock item being White Chocolate Snowballs; we filter on the name. If you look at the MATCH predicate, what we are trying to do is find customers who bought stock items, so we're going from a customer to a stock item. Anything that appears at the two ends of the arrow in MATCH is a node table; anything that appears in the parentheses in the middle is your edge table. You always go from one node to another node via an edge; you can't go from a node directly to another node. The direction of the arrow is the direction of the edge you specified. If you remember, when we inserted values into the Bought edge, our $from_id was the customer and our $to_id was the stock item, so we go from customer to stock item, and that's exactly what we do in this query.

I'm going to show you a quick demo of these features and this syntax, and for that we use a product recommendation scenario. We start from a customer, and we assume the customer bought a stock item, which was supplied by a supplier and came from a location; the customer is also from a location. A customer can also know another customer. Let's say I want to use social media marketing and send ads to the top influencers in my graph, and to friends of those top influencers, so I maintain some social media data. Now, this is a very straightforward, small schema that can easily be implemented with a relational approach. But suppose I want to build a recommendation engine for the case where people who bought this also bought that. If you go to amazon.com and buy something, at the bottom they show you items that other people who bought the same thing also bought, so you might be interested in buying those as well. If I want to build a recommendation engine like that on this simple schema using a relational approach, the query looks something like this: I have the current user buying a product, with their customer ID and the stock item they are buying; from there I find the other users who bought the same stock item; after that I find the other items those users have bought; and once I have that list, I make recommendations to my current user. Not very easy to extend, and not easy to maintain in the future.

Let's see how this looks when you implement it using the graph approach. This is our recommendation query; let's try to understand what the MATCH clause is doing. We identify all the customers that have bought this item, the item being White Chocolate Snowballs, so we get a list of all the people who bought White Chocolate Snowballs. Once we have that list of customers, we look at the other items those customers have bought. And once we have that list, we filter out our current customer and the item they were already buying, because we don't want to recommend it back to them.

Let's see a quick demo. This is the query here. All right. Here I have the simple MATCH query we just saw, where we're trying to find all the customers that bought a stock item. Let me run this, and we get a list of all the customers.
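The graph recommendation query being described looks roughly like this; a hedged sketch with assumed table names, not the exact demo script (which is on GitHub):

```sql
-- "People who bought this also bought..." for a given item
SELECT TOP 10
       s2.StockItemName,
       COUNT(*) AS TimesBought
FROM dbo.Customers  AS c,
     dbo.Bought     AS b1,
     dbo.StockItems AS s1,
     dbo.Bought     AS b2,
     dbo.StockItems AS s2
WHERE MATCH(s1<-(b1)-c-(b2)->s2)           -- customers who bought s1, and what else they bought
  AND s1.StockItemName = N'White Chocolate Snowballs'
  AND s2.StockItemName <> s1.StockItemName -- don't recommend the same item back
GROUP BY s2.StockItemName
ORDER BY TimesBought DESC;
```

In the full demo you would also exclude the current customer from the middle of the pattern; this sketch just aggregates over everyone who bought the item.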
So that's a simple MATCH, going from one node to another node via an edge, and the recommendation query builds on it. Running the recommendation query, we get a list of other items that other people have bought and the number of times each was purchased, and from that we can make recommendations. I have a TOP 10 here, so we get the top ten; if you remove that, you get a bigger list.

You might say this is all fine, I can do this MATCH and it looks good, but what if I want to traverse an arbitrary number of hops in my graph? If I want to find friends who are one to five hops away from me, or one to ten hops away, how do I do that with the new MATCH syntax? The answer is that in this first release, MATCH doesn't support arbitrary-length traversal of the graph. There is a work-around we suggest to customers who have that scenario: write a procedure that uses MATCH in a loop and traverses the graph iteratively. In the example here, I'm going to find friends of a customer one, two, or three hops away; my max-hop value is 3. If I set it to 5 or 10, I get customers up to 5 or 10 hops away. Let's run this query and see the results. All right. It tells me the people who are one hop away, then the people who are two hops away, three hops away, and so on. So if I want to run the social media marketing campaign I mentioned, I can run this, find the people connected to my top influencers, and send the campaign to them.

What if I want to write a different kind of query? Say I know a couple of top influencers for my product on social media and I want to know how they're connected to each other; if that affects my marketing in some way, I want to research it. For that you would want a shortest-path query: how are these two people connected, and what's the shortest path between them? Again, we don't provide a built-in shortest-path function in the engine in this first release, but you can write a work-around to calculate it. This one uses a SQL loop; it's essentially a breadth-first traversal. Let's run it and then I'll explain how it works. I'm trying to find the shortest path between David and Emma, and we see that they're connected by four hops; this is the path through which David can reach Emma, if he wants to. I don't know if you're all able to see it here. Okay. What it's doing is finding the first-level friends of David with a MATCH, and then checking whether Emma is among them. If Emma is not there, it goes ahead and finds the friends of the people it found at the first level, and so on. It stores the results in a temporary table and keeps going until it finds Emma; this filter here is for that. As soon as Emma is found, we quit and return the results to the user. That's how the query can be implemented. All right.

All these demos are in the resources slides later on; you can get them on GitHub, download them, and try them. All right.
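A minimal sketch of the loop-based work-around for multi-hop traversal, under assumed names (a Knows edge between Customers); the actual demo scripts on GitHub differ in detail:

```sql
-- Find customers up to @MaxHops away from a starting customer by applying
-- MATCH one hop at a time and accumulating results in a temp table.
DECLARE @StartCustomerID INT = 1,
        @MaxHops         INT = 3,
        @Hop             INT = 1;

CREATE TABLE #Reached (CustomerID INT PRIMARY KEY, Hops INT NOT NULL);

-- Hop 1: direct connections
INSERT INTO #Reached (CustomerID, Hops)
SELECT DISTINCT c2.CustomerID, @Hop
FROM dbo.Customers AS c1, dbo.Knows AS k, dbo.Customers AS c2
WHERE MATCH(c1-(k)->c2)
  AND c1.CustomerID = @StartCustomerID;

-- Hops 2..@MaxHops: expand from the previous frontier, skipping nodes already reached
WHILE @Hop < @MaxHops
BEGIN
    SET @Hop += 1;

    INSERT INTO #Reached (CustomerID, Hops)
    SELECT DISTINCT c2.CustomerID, @Hop
    FROM dbo.Customers AS c1, dbo.Knows AS k, dbo.Customers AS c2
    WHERE MATCH(c1-(k)->c2)
      AND c1.CustomerID IN (SELECT CustomerID FROM #Reached WHERE Hops = @Hop - 1)
      AND c2.CustomerID NOT IN (SELECT CustomerID FROM #Reached)
      AND c2.CustomerID <> @StartCustomerID;

    IF @@ROWCOUNT = 0 BREAK;   -- nothing new reached; stop early
END

SELECT CustomerID, Hops FROM #Reached ORDER BY Hops, CustomerID;

DROP TABLE #Reached;
```

The shortest-path work-around in the demo follows the same level-by-level pattern, stopping as soon as the target person shows up in the frontier.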
After I show this syntax to customers, people usually ask me: if MATCH is going to make my life so easy, should I just stop using relational and start using the graph approach? I say no. There's nothing a graph database can help you solve that a relational database cannot. We just showed you the recommendation query; you can write it with the relational approach and with the graph approach. It's just that in some scenarios, where relationship traversal is the prime focus of your application, or where relationships have their own attributes and take center stage, you might want to consider the graph approach, because it makes the queries simpler, more intuitive, and easier to maintain.

Here are some scenarios where we have seen people use graph databases. Recommendation systems, like the one I just showed you. Fraud detection in banking and insurance, where entities are naturally connected to each other: if you want to find fraud rings, or identify the links and hidden connections between people, graph databases work well. Content management, which is hierarchical data: a hierarchy of contents, or of parts that finally make up a product, where you want to traverse the hierarchy top-down or bottom-up; those queries become easier to express, so people are turning toward graph databases for that too. And CRM and marketing, which is naturally a graph of people connected to each other, with opportunities and accounts; say you want to find the shortest path between two people working on a similar project, account, or opportunity. In those scenarios graph databases shine. With that, I'll hand it over to Joe.

>>Can you guys hear me okay? Great. How many developers in the room? Okay. Good. All right. I've been working with SQL Server since '99, and a lot of that time has been spent doing performance tuning. Whenever I have a slow-query scenario, the recommendations around what to look at have always revolved around two things. The first is the waits and queues methodology, and the second is what we'll discuss here, which is the nature of cardinality estimation. If that sounds esoteric, there are practical applications from a pure query tuning perspective, and we'll go through them.

Now, last year in 2017 we introduced a new feature family. You might have noticed that the query processing aspects of SQL Server and Azure SQL Database had been relatively quiet and have had a renaissance of development over the last couple of years; we'll have more to announce over the next year as well. The new feature family is called adaptive query processing, and the idea is that we can't always make good predictions about whether the plan we put together is going to perform well. Sometimes we have to make guesses, and adaptive query processing allows us to adjust those assumptions and adapt in real time based on runtime conditions. We'll go through each feature. Notice the question mark there; I include it as a placeholder, because this is just the beginning. We're laying the base for a set of innovations, and understanding how these work will be helpful down the road for what we introduce next.

You all willingly came into a query processing session, so I have to subject you a little bit to query processing mechanics. In terms of query processing, I have a query; let's say I'm selecting from a table.
Select from the Orders table where ProductID equals 10, all right? What happens? We submit that query to SQL Server or Azure SQL Database. We take the query text and parse it. We potentially do transformations: rearrange things, remove contradictions, and simplify the query where we can. Then we get to a crucial stage, estimation. If I'm selecting from an Orders table where ProductID equals 10, there are questions around the number of anticipated rows: how many rows in the Orders table have a ProductID of 10? We have to estimate, and that estimated row flow impacts the kind of plan we create. Let's say I estimate one row; I say there's only a match on one row, and we base that estimate on the statistics associated with ProductID. So we go ahead and cost the plan. This is all happening under the covers: we have an idea of row flow based on the query you submitted, and we start costing alternatives, because there are different things we could do. We have different join algorithms, different ways to aggregate, and different orders of operations. The query optimizer's goal is to get a good-enough plan fast enough that finding it is worth it; it doesn't pay to wait forever for the perfect plan. We cost different alternatives and select the one with the lowest cost. That plan is cached in memory. Executing the query for the first time, where ProductID equals 10, is the initial compilation; once the plan is cached, subsequent executions match on the text, pull the plan from the cache, and get ready to execute. At that point we grant any memory the plan's operations require, we look at the degree of parallelism, and we execute according to the plan.

Now, what can go wrong? Well, what if I get the estimate wrong? Select * from Orders where ProductID equals 10: I estimate one row, but I get a million rows. Prior to adaptive query processing, that's tough. You have this plan, it's cached, and if I have a bad estimate, that estimate is used end to end. We don't change the assumptions; we still use the one-row assumption through the end of the execution, and if performance suffers, so be it. We didn't have any mechanisms leading up to 2017 that allowed us to adapt. Any questions on that so far? Yeah.

[ Inaudible question ]

>>The question is whether the query is sargable, whether there's a predicate with a supporting index that can be used. For ProductID = 10, if I have an index on ProductID, that means I have an index and statistics associated with that index, and the statistics show a data distribution that reflects whether there's one row or a million rows for that value. So why would this be wrong? Why would we show one row but actually have a million? This is where the developer skills of query tuning come into play. There are lots of different reasons why this can happen, and pinpointing it quickly is definitely helpful when you're in a time crunch and optimizing on the fly.
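Before walking through the common reasons, here's a quick, hedged way (with assumed object names) to see what the optimizer is working from and to compare estimated versus actual rows:

```sql
-- With an index on ProductID, the optimizer has a statistics histogram
-- describing how many rows match each ProductID value.
CREATE INDEX IX_Orders_ProductID ON dbo.Orders (ProductID);

-- Inspect the histogram that the estimate comes from
DBCC SHOW_STATISTICS ('dbo.Orders', 'IX_Orders_ProductID') WITH HISTOGRAM;

-- Capture the actual execution plan to compare estimated vs. actual row counts
SET STATISTICS XML ON;
SELECT * FROM dbo.Orders WHERE ProductID = 10;
SET STATISTICS XML OFF;
```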
The common reasons, then. Your statistics are missing: there are no statistics on ProductID, so the optimizer has to guess. Maybe there's no index or stats because you disabled auto-creation of statistics, so we fall back to fixed guesses in the absence of that information. Stale statistics: you have an order system with heavy throughput, a million products got purchased recently but aren't reflected in the statistics yet, so when you execute the query we're working with old assumptions. Sampling rate: the bigger the table, the more jagged the data distribution can be. Maybe ProductID 1 has one row and ProductID 10 has a million rows; the lower the sampling rate relative to the table, the more likely we are to miss interesting distributions that matter for query performance, and if a skew isn't in the statistics, we build a plan that doesn't reflect reality. Bad parameter sniffing is the common term; how many have heard of this? Good. I'll show you an example of parameter sensitivity, but here's one flavor: a stored procedure just got developed, somebody jumps in to test it, and they use parameter values that are totally atypical, values nobody is actually going to use. They merely test it, and the plan gets cached based on those atypical parameters. Everybody else uses the typical parameters, which have a different data distribution, and performance suffers. Then there are out-of-model constructs. That's a fancy way of saying there are things beyond a plain select with a where and a group by where we don't have a good technique, or good visibility, for estimating the number of rows. We call them out of model: we don't handle them today, or they have a level of complexity where they may never be adequately handled. The example I'll give today is multi-statement table-valued functions, a T-SQL construct that's been around for a long time; we'll talk about how they're used, and I'll show you firsthand what the behavior was before 2017. The last thing is correlation assumptions. Say I have an Orders table again, and I'm selecting where State equals Minnesota and City equals Minneapolis, so I have two different predicates. We know that city is correlated with state, but in an older version of the estimation model we did not know that in the absence of multi-column statistics. If you have multiple predicates (another example is the make and model of a car, VW and Golf, two different columns), we assumed they were totally independent, and we underestimated the number of rows flowing through, because nothing told us those two columns were correlated. Those are the most common and most frequent reasons.

Why should you care? I get a bad plan, and if I ask for too much or too little, the issue is of course performance: your query's performance, or other people's. Excessive resource utilization: I'm asking for too much CPU, memory, or I/O, or I'm spilling to disk. Say I only asked for 10 MB but need 1 GB of RAM for my memory grant; now I have to spill, which is significantly slower than asking for the right amount in the first place. Reduced throughput: say you give me a one-gigabyte memory grant and I only use 20 MB of it. I run quickly, so I have no performance issue, but 20 other concurrent requests asking for memory have to wait, gated behind mine. A few very large requests doing that can cause unnatural gating: nobody is using the memory, it's wasted, but your throughput suffers. That happens more often than you'd think. And T-SQL refactoring: multi-statement table-valued functions are popular because they're reusable and you can encapsulate that code.
You can modify business assumptions in one place, but from a query optimization perspective, leading up to 2017, a multi-statement table-valued function was a black box. In older versions we assume one row flows from it. What if you have a million rows?

All right, so on to what we introduced in 2017. Again, think of this as laying the base for different techniques in the engine: rather than baking in a new set of fixed assumptions, we try to adapt during execution. The first feature is interleaved execution for MSTVFs; I know it's a mouthful. If I select from a multi-statement table-valued function, even one with millions of rows, we used to assume one row flows from the function, and if you then perform other operations, the downstream side effects of that underestimate can be very painful. I mentioned that once we have a plan, once we optimize and execute, there's no going back: even if I estimated one row and have already seen a million rows flow through, we don't re-optimize that plan. This is the first time we broke that boundary between optimization and execution. In this first version we optimize up to the point where we see the function, then we execute the MSTVF, which we were already doing behind the scenes anyway, and then we take the actual cardinality, re-enter optimization, and optimize based on the actual runtime row count. There are other opportunities in the engine, things like table variables and other constructs that have the same fixed-guess nature, and some are more amenable to this pause-and-resume approach than others. MSTVFs were a perfect conjunction: a lot of people still use them today, and they're problematic from a performance perspective. So that's one of the features we added last year.

The second one is memory grant feedback. Here's the problem, and it can go both ways: I overestimate how much memory I need, so I think I need a gig and use 20 MB, or conversely I don't ask for enough and now I'm spilling to disk. What we do is learn, based on executions. If I execute and spill to disk, we register an event that says you spilled, so add memory; the second execution adds enough memory to cover the spill plus a buffer. If I'm wasting memory, more than 50 percent of the grant, we reduce the memory footprint. If it needs a couple of adjustments across consecutive executions, we keep improving it; most of the time it's corrected on the second execution. This is the first time we're learning from execution and feeding that back into the process. Question?

>>Is this all turned on by default?

>>The question is, is this all turned on by default? Thank you. It is turned on under compatibility level 140, which is the level for 2017. That's an absolute must, because we made this agreement that if you are upgrading, your plans shouldn't regress due to any kind of change or improvement we added; there should be no plan regressions. So compatibility level 140 is the contract that says: if you go to 140, you're opening your feature set up to new features. Stepping back a little further: no matter what, because of the unique characteristics across customers, data sizes, and schema variations, we expect most plans to improve, but there will be a regression somewhere; it will happen. That's why we lock this behind a compatibility level. Thank you for asking. 140 enables everything we're talking about today. If you create a new database in SQL DB today, the default is 140, and on SQL Server 2017 the default is 140.
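For reference, a hedged sketch of how you would set the compatibility level and, if needed, opt individual features back out per database; the database name is assumed, and the scoped-configuration options are the ones documented for SQL Server 2017:

```sql
-- Enable the adaptive query processing family by moving to compatibility level 140
ALTER DATABASE WideWorldImportersDW SET COMPATIBILITY_LEVEL = 140;

-- Check the current level
SELECT name, compatibility_level
FROM sys.databases
WHERE name = N'WideWorldImportersDW';

-- Individual features can be disabled per database without dropping the level
ALTER DATABASE SCOPED CONFIGURATION SET DISABLE_INTERLEAVED_EXECUTION_TVF = ON;
ALTER DATABASE SCOPED CONFIGURATION SET DISABLE_BATCH_MODE_MEMORY_GRANT_FEEDBACK = ON;
ALTER DATABASE SCOPED CONFIGURATION SET DISABLE_BATCH_MODE_ADAPTIVE_JOINS = ON;
```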
You can always bump it down if you have to wait on compatibility testing as well; that's perfectly supportable. Okay.

The third feature, and then I'll show you demos, is adaptive joins. You'll notice that each of these operates at a different scope. The first, interleaved execution, works within a single optimization: we can go in and out of optimization and execution the first time we optimize. The second, memory grant feedback, is for repeating workloads: we learn across executions. This third one adapts at the operator level. So what operator are we talking about? If I have a low row-flow estimate, so I think a low number of rows flows through a branch of the tree, and I'm joining tables, I have a couple of algorithm choices. In this example you see two different trees that are semantically equivalent: one has a hash join and one a nested loop. Nested loop is what you want for low row flow, because there's a lower start-up cost; hash join has a start-up cost for memory, but it's optimal for large row sets. What if we're wrong? Then you end up with an inappropriate join algorithm. For the first time, we added an operator that behind the scenes can switch techniques during execution. Keep in mind, there is one compiled plan, and then one or more adaptive joins within that plan, depending on how many tables you join. Whether my execution flows low or high, it's the same plan; we're not doing a new compilation. We look at an intersection of costs: on the x-axis you see rows, on the y-axis you see cost. The nested loop line starts low, because there's no start-up cost and minimal memory required, but past a certain number of rows it becomes more costly than the alternative. The hash join line has a higher initial cost but grows more slowly. For that particular plan and those particular tables we find that intersection point, and the operator watches for when we reach that threshold. If we exceed the threshold, we do the hash join; if we don't, we use the nested loop. It's the first time we've had an operator that, for the same cached plan, can switch strategies. Okay, I'll show a demo of this. I think I'm... am I on demo 7? Here we go. I am. Okay.

How many have used multi-statement table-valued functions before? Okay, all right, we see a lot of them out in the wild; again, the virtue is understood. I'm going to start off on compatibility level 130, so the adaptive query processing features aren't enabled; this is the "before" state, and I'm going to include the actual execution plan. I have this WideWorldImportersDW database that I use; just like with the graph examples, this is all on GitHub as well, with equivalent demos. I'm selecting from this fact Order table, and I'm joining to an MSTVF, my function right here. It takes a few parameters and says: based on an event like a mild recession, show me what impact that would have had on order quantity. Here's the before, and here's the what-if scenario. I'm joining to it like I would a table, and I'm including the actual execution plan. This is the legacy behavior: we get the results back in about three seconds. If you look here, even if you're not an expert at reading query execution plans, or haven't looked at them before, this is an area where, if you know the query you need to tune, looking at an actual execution plan is absolutely critical.
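For reference, a function like the one in this demo is shaped roughly as follows; this is a hedged sketch with assumed names and logic (the real demo function in WideWorldImportersDW is on GitHub and differs in detail):

```sql
-- A multi-statement table-valued function: the optimizer cannot see inside it,
-- so before interleaved execution it used a fixed row-count guess for its output.
CREATE OR ALTER FUNCTION dbo.WhatIfOutlierEventQuantity
(
    @Event     NVARCHAR(30),
    @StartDate DATE,
    @EndDate   DATE
)
RETURNS @Result TABLE ([Order Key] BIGINT, [What If Quantity] INT)
AS
BEGIN
    INSERT INTO @Result ([Order Key], [What If Quantity])
    SELECT [Order Key],
           CASE WHEN @Event = N'Mild Recession' THEN Quantity / 2 ELSE Quantity END
    FROM Fact.[Order]
    WHERE [Order Date Key] BETWEEN @StartDate AND @EndDate;
    RETURN;
END;
GO

-- Joined like a table; under compatibility level 140, interleaved execution pauses
-- optimization, runs the function, and re-optimizes with the actual row count.
SELECT COUNT(*)
FROM Fact.[Order] AS o
JOIN dbo.WhatIfOutlierEventQuantity(N'Mild Recession', '20130101', '20131231') AS w
    ON o.[Order Key] = w.[Order Key];
```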
Looking at the plan, one thing that stands out right away is the little warning symbols on some operators. If you see warning symbols, that's a pattern to check out immediately. If I hover over this operator, the warning says the operator used tempdb to spill. My query spilled to disk: it didn't have enough memory for the number of rows that flowed through. If I look at the actual output from the what-if outlier event quantity function, I can hover over the operator and see the estimation: the estimated number of rows is 100, which is the fixed guess, and the actual number of rows is 231,000. This has been the behavior since these functions were introduced, around 2005; it's been a long time. You won't notice it with a low row count, but when you get to larger row counts and join the function to other operations, performance can suffer. That's our plan before.

Now we enable 140, and that's the only change I make. That's the emphasis for this feature set: we want your existing code to run faster. So I rerun the query I just ran, the only difference being that I'm on 140. If we look at the execution plan, it has a totally different shape: operations got moved around, because the bigger table should be on the probe side of the hash join, and there are all kinds of reasons to get the order of operations right when you have several hundred thousand rows instead of a few. And check this out: there are no warnings; we didn't spill to disk. If we look at the table scan for the what-if function, the estimated number of rows is 231,000 and the actual is 231,000. You didn't have to do anything other than bump up the compatibility level, but it's important to be aware of how it works and why it's happening. All right.

Now let's look at memory grant feedback. I'm going to show a couple of examples: one of wasted memory and one of spilling. In this case we start off right away on 140. I have this query from TPC-H, based on query 12, which is a shipping-modes and order-priority query. I include the actual execution plan and execute it. I chose this one because it asks for too much memory; if I hover over the select, I see a warning down here: the grant size is about half a gig (shown in KB), but we only used 17 MB of it. So we're wasting a ton of memory; I asked for too much. You saw the query ran quickly, so no problem for me. But the moment you have a concurrent set of requests that also overestimate what they need, you start to gate. I mentioned at the very beginning of this section that my two techniques were waits and queues and cardinality estimation. If something is waiting on a resource, you can see it; in this example, if I start gating, you see a RESOURCE_SEMAPHORE wait, which is the wait type where you're gated on memory. That could be a legitimate memory request, where you really need the memory and need a bigger system or need to rearrange things, or it could be that I didn't use the memory and I'm causing gating for other concurrent requests. Given that, let's go ahead and execute again. All right: the warning symbol went away on my second execution. If I look here, the memory grant size is now 26 MB. What happened? I wasted more than 50 percent of the grant; we logged that internally, and on the next execution we adjusted the memory downward by the amount wasted plus a buffer for safety. Again, this is the first time we introduced something that learns based on execution count. This helps with repeating workloads.
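If you want to see grant sizes outside the graphical plan, one option is the memory grants DMV; a small, hedged example:

```sql
-- Requested vs. granted vs. actually used memory for requests currently holding
-- or waiting on a memory grant.
SELECT session_id,
       requested_memory_kb,
       granted_memory_kb,
       used_memory_kb,
       max_used_memory_kb,
       ideal_memory_kb
FROM sys.dm_exec_query_memory_grants;
```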
There are a lot of scenarios where you're one and done; this won't help with ad hoc queries, but it helps with anything executed more than once. All right?

Now let's look at the spill scenario. I'm going to use WideWorldImportersDW again, under 140. I have the classic parameter-sensitive scenario: a stored procedure that takes a lineage ID as an input parameter and filters on it, where a key column equals the value I pass in. I'm joining a couple of tables, and that key is the only thing I ask the end user to provide. Let's make sure the procedure is created. I'm going to play the tester: somebody says, can you test this before it goes live in production? Sure. Let's say I have no idea what a proper lineage key is; I don't know what a good test value should be, so I pick a key of eight and execute it. No rows come back. The tester says it looks good, it ran without error. If you look at the execution plan, you see little symptoms like this skinny line; that shows the row flow, and if it's skinny we're assuming a small number of rows flowed through. The problem is that if I now execute with a representative, typical parameter value, it's going to run a long time and performance will suffer, because the person who just tested it with parameter value 8 caused that plan to be cached, and now it's sitting there waiting for everyone else. That plan doesn't represent the typical data distribution. This is one variation of parameter sniffing. So what are the consequences? I ran it again with a good parameter and got back 231,000 rows, and the plan is littered with warnings: a sort warning and a hash warning. If you look at the sort, the depth of that spill is quite high, meaning the number of times we went back to disk. We got it really wrong and had to keep going back to disk to undo our misestimation. What happens if I execute the stored procedure a second time? Memory grant feedback kicks in behind the scenes, and the memory warnings go away. The plan shape itself still isn't appropriate, but at least we no longer ask for the bad memory grant caused by the bad plan that got cached. All right.

Then the last example for the adaptive query processing features: I'm going to use the TPC-H database again, based on query 18, which is the large-volume-customer query. For adaptive joins, just to remind you, the idea is that we have one compiled plan. The first time I execute this, I get a compiled plan with an operator in it that can go with two different join strategies, so we don't need to recompile to benefit. I execute it with the value 314, which I happen to know returns 85 rows. If I go to the execution plan, you see this new operator. A few things to point out: you have the adaptive join as the operator, and you have this outer branch, which could be the driver of your nested loop or the build phase of your hash join. You have the second branch, which is the probe phase, used if we go with the hash operation, and you have a third branch for the nested loop. In terms of visual indicators, you see thick and thin row lines, but you don't have to count on those; you can hover over the adaptive join. There are a few values to bring to your attention. Estimated join type: when we compiled, we estimated it would be nested loop, and when it ran, nested loop is what we used. Why did we make that decision? We have the adaptive threshold rows.
Remember that intersection of the two algorithms, cost versus rows: that represents the intersection for this particular data set. At 1,219 rows, if we exceed it we go hash, and beneath it we do nested loop. If you look at the actual number of rows, 599 is under the threshold, so we used nested loop. Now let's execute with a parameter that returns far more rows. We don't compile a new plan here, we don't play any tricks around the mechanics of it, and we don't add hints; it's one plan accommodating two different algorithms from one operator. If we look at the tree now, notice that this line is thicker: this is the build phase of the hash join, and the nested loop input is now skinny. If we hover over the adaptive join, this is an example of where adaptive joins are beneficial, because we estimated nested loop and we were wrong: we went with a hash match, since our adaptive threshold is about 1,200 rows but the actual number of rows is 458,000. So we wanted to change the algorithm. Any questions on this? Yeah.

[ Inaudible question ]

>>The question is, do we have merge joins in this? There's been a merge join hint around for a long time, but we didn't choose merge as an algorithm choice for this particular feature. Whether or not that gets on the roadmap, we'll see. We noticed most of the extremes are between nested loop and hash join. With merge, typically, from an optimizer perspective, if you have presorted inputs we already make a good decision; so the choice was based on what we see in customer usage patterns on average.

[ Inaudible question ]

>>If you run it with a query hint, hints rule all; if you do that, you tie the hands of the optimizer. That's one of the dangers of query hints: what might be the truth today may not be in ten years. By the way, that's one of the markers: if I see a lot of hints sprinkled around, I'm either dealing with someone very, very smart, or with someone who went to Stack Overflow, found something that worked, and it didn't stand the test of time when the data distributions changed. Any other questions on this? Yeah. Okay. You here.

[ Inaudible question ]

>>I'll reframe it a little bit, and if I don't capture the nature of it, stop by afterwards and we'll go through it. Can we do a seek versus scan adaptively? Can we switch the actual seeking strategy based on what we see? One example: ProductID equals 10, and a million rows pass through, so we would want to do a scan instead of a seek. The answer is no, we don't have that today. It's an interesting space, though. What we focus on are the lower-risk, higher-reward changes, and seek versus scan is a space being discussed. One more question. Yes.

[ Inaudible question ]

>>Could you repeat that one more time? I'm sorry.

[ Inaudible question ]

>>By the way, feel free to come back up here as well if I misinterpret it. Does the query execution plan we're showing reflect the actual characteristics of your storage, CPU, and so on? It's a good question, because a lot of myths have grown up over the years. When you see estimated CPU and I/O in the plan, those numbers are based on a machine long gone, from around 1997. That machine had specific assumptions, for example how expensive a serial I/O is versus a singleton lookup. So we had these costing functions which, at that moment in time, meant something; now they're a unitless measure. That doesn't mean they're invalid, because the relative values are still important. But if somebody looks there and says estimated CPU is 1.2, the number itself doesn't mean anything.
It just means that of the different costing alternatives, we chose the one that was cheapest based on the algorithms selected in that tree. Okay? One more, and then we're going to Kevin.

[ Inaudible question ]

>>The question is about an oscillating scenario: sometimes the query returns one row and sometimes a million. There's no right answer; it's truly parameter sensitivity, where no one size fits all. In that scenario we disable the feedback behind the scenes: if there's too much oscillation, we disable it. In terms of visibility, you can see that with an extended event. After a certain number of oscillations we stop doing memory grant feedback; if it keeps going up and down, we stop. All right, we'll move to Kevin now.

>>All right. I'm going to talk a bit about automatic tuning. We're starting to move into the realm of what we call intelligent query processing. Joe talked about adaptive query processing, where we adapt what we do based on what we see during execution. We realized there's another whole set of strategies we could use that aren't adapting to anything at runtime but make the query processor smarter. The overriding theme to all of this is that we're looking at ways to make things just go faster and to take the work off of you, so you upgrade and things run faster and smoother. That's the goal.

So, automatic tuning. Who is familiar with the Query Store? The Query Store is the flight data recorder for queries. What it does is track, for a given query hash, the performance over time of the various plans for that query. You can go into a UI and say: for query number 35, or for a query with this text, we had this plan and that plan; that plan was slow and this plan was fast. What automatic tuning does is take advantage of that and take it a step further. Queries get recompiled for a lot of reasons, different parameters being a classic one (parameter sniffing), and the new plan might not be optimal for the general case; that's the example Joe showed. Overall performance can degrade abruptly if the wrong plan gets compiled, and it can take time to figure out what went wrong. The Query Store has all the information we need to fix that, because it can tell you: for this query we had two plans, and the plan you're using right now takes ten times the CPU of the other plan. It gives you the ability to go in and force the faster plan. That's a huge benefit, but we can do better. What automatic tuning does is take that whole analysis and tuning operation and automate it. The idea is that you have a performance problem, and we can fix it automatically, without you doing any tuning.

This runs in three different modes. You can turn it all off, disable the Query Store, and do everything like you did for the last 20 years. Or you can turn on the Query Store without any automatic tuning enabled; that tells you when you have a performance problem and a regression because of a new query plan. The bar graph here shows execution time: with tuning recommendations, the Query Store tells you that you have a regression and what it thinks the problem is. You get a recommendation that says: you had a regression, this is the problem, and here is a snippet of script to correct it. So you can correct it by running a piece of script.
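A hedged sketch of what that looks like in T-SQL: inspecting the recommendation DMV, forcing a plan manually, and turning on the automatic mode discussed next. The JSON path follows the documented shape of the details column, and the IDs are placeholders you would take from your own output:

```sql
-- What does the Query Store / automatic tuning recommend right now?
SELECT reason,
       score,
       JSON_VALUE(details, '$.implementationDetails.script') AS fix_script
FROM sys.dm_db_tuning_recommendations;

-- Manually force the last known good plan for a specific query
-- (query_id and plan_id come from the recommendation details or the Query Store UI)
EXEC sp_query_store_force_plan @query_id = 35, @plan_id = 42;

-- Or let the engine do it automatically
ALTER DATABASE CURRENT
SET AUTOMATIC_TUNING (FORCE_LAST_GOOD_PLAN = ON);
```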
Once you gain confidence in that, you can turn on a switch that says automatically force the last known good plan. If there's a plan change that makes things worse, we find the last plan that was good and force it as the plan being used. In the chart, the red bars show your execution time going way up when the plan regresses, and there's the syntax for forcing the good plan. It can be helpful for the parameter sniffing situations you run into; the tuning recommendations DMV puts out a JSON bundle that tells you what went wrong. And with automatic tuning, the system just corrects itself. Let's see what that looks like.

This looks a little different from the UI you're used to. This is SQL Operations Studio, a new tool. I'm using it because it has a really good way to chart results over time, which makes a good visual for this demonstration. This demo uses the WideWorldImporters database again. We've got one script called "report," which just runs a report query in a loop using a particular parameter, the value 7. We have a perf collector that captures batch requests per second with a timestamp into a temp table, so it spins in a loop collecting how many batch requests per second we're able to run. And we have a regression script that introduces a regression via parameter sniffing: we execute with a non-typical parameter, so when we go back to the report query, the cached plan is the wrong one.

Let's see what that looks like. We start up the report query, so that's spinning and running. Let that settle for a little bit, then look at the batch requests the collector gathered. Operations Studio has various kinds of charts; let's show this as a time series, and we have a performance level established. Now I'll run the regression script, which will cause a regression. All of this is up on GitHub, so you can try it yourself. Let's see what the regression script did to our performance: run the collector query again and visualize it as a time series... there it is. Performance drops off a cliff; something bad happened. How many of you have had a production system suddenly tank its performance? How many had it happen in the middle of the night? That makes for a bad night: phones ring, you have to wake people up and figure out what happened, and people are not happy.

So we have a script here that looks at the recommendations, the recommendation DMV. You see what happened: the average CPU time changed from 78 milliseconds to 309 milliseconds; that might be a problem. You see the state transition reason: automatic tuning options not enabled. And the script to fix it is to force the right query plan.

So let's turn automatic tuning on. Oh, come on. Okay, I'll kill that one and start everything over: stop the report query, stop the perf collector, and run the init script that resets everything to its original state. That's interesting. Okay, there we go. This should have been cached; I don't know what happened. There we go. Now we've reset everything: start up the workload, start up the perf collector, look at the performance levels, and make it a time series. We've only got a few data points so far, but you see the performance level we're getting. Now we turn on auto tuning here, which sets the option we saw before, so when we hit the regression, it should fix itself. Okay? So we just hit a regression.
We look at the performance levels: you saw it fall off a cliff, then we forced the last known good plan and it comes right back up. In this demo it took about 10 seconds from the time the performance fell off a cliff until we recognized there was an issue, determined the fix was to force a good plan, and applied that fix. Ten seconds is not enough time to even ring the phone and wake anybody up, let alone have them deal with it. You see a log entry in the morning that says there was a blip during the night, and you can investigate it or not. The main thing is nobody gets woken up in the middle of the night. This is why I really like this feature.

Beyond 2017, what's happening? As Joe mentioned, we have features in development going beyond adaptive query processing. We're looking at why you have performance problems, at constructs we can't accurately estimate, and at ways to make them perform correctly so the optimizer gets a good view. There are lots of different strategies we're working on. We will be introducing them in Azure SQL Database as they become available: as soon as something goes into public preview in Azure SQL Database, we'll blog about it so you can read about it and try it. They'll also be in the next iteration of SQL Server on-premises when that ships.

[ Inaudible question ]

>>We release bits into Azure about once a month. The first stage is that a feature goes into public preview, where we announce it but it's not to be used for production yet; we let people get experience with it and let us find out if anything goes wrong. When we're confident, we make it GA, which means you're okay to use it in production. So watch this space: the SQL Server storage engine blog is where the SQL engine team posts. It's all the same source code, and at this level there's no distinction between the two engines; that's where you'll see the announcements when we light up new features. Any questions?

[ Inaudible question ]

>>So, missing indexes: on-premises you get a warning. With Azure you get a more fine-grained recommendation, and in automatic mode it creates the index for you on the fly. It will continue to monitor performance, and if it learns from that monitoring that the index didn't help, it will back it back out so you don't have to maintain it.

>>By the way, on the next two slides, which you'll get with the deck, we have recommended links for graph as well as intelligent QP and adaptive QP, if you want to follow up. There was a question over here.

[ Inaudible question ]

>>Who owns the responsibility of maintaining indexes? This is very timely; I had the same question at a booth. What we just talked about covers creating and dropping indexes; it does not maintain those indexes for you. With that said, you want to be smart. You don't want to rebuild a bunch of indexes every single day just because you think it's helpful; you don't want to use resources unnecessarily. My recommendation is to take a look at the workloads that use that index, and if you see there's a lot of fragmentation that leads to higher I/O, add your own maintenance job. If we did it on your behalf, it could be inappropriate; there are scenarios where your indexes should be left alone and the fragmentation doesn't matter. It's still in the realm of something you choose to apply, but I would say do so intelligently.
Do so by looking at DMVs, actual run-time characteristics, and history, rather than indiscriminately rebuilding everything. Okay? Any other questions? We'll take one more, and for anything else you can come up to the stage. I saw one here and a bonus question there. All right.

[ Inaudible question ]

>>The axis on the graph is batch requests per second; it's the level of performance, how many batches per second. That's the vertical axis, and the horizontal axis is time.

[ Inaudible question ]

>>So this was how fast you can run this report.

>>It was the CPU time associated with each execution of that request, and we track that, so it's looking at whether there was a difference between the prior and the current running average.

>>And since it was a serial execution, the less time each execution takes, the more we get in per second.

>>Okay. There was one bonus question, and then we'll be done.

[ Inaudible question ]

>>There are some performance optimizations we have introduced for graph. The queries I showed were very short, just so they're easy to understand. If your MATCH patterns are very long, say you're searching through a deep hierarchy, you get some implicit optimizations we have introduced, so those queries run faster than the equivalent relational query. Besides that, for the next version we are introducing features with built-in optimizations for graph queries. That's the benefit: if you adopt it now, you get those in the future too.

>>So please complete the evals; that's how we learn, get better as speakers, and present even better sessions next time. Thank you.
