<t>I have a question about Entity Framework query execution performance.<br/>
<br/>
Schema:<br/>
<br/>
I have a table structure like this:<br/>
<br/>
CREATE TABLE [dbo].[DataLogger]<br/>
(<br/>
[ID] [bigint] IDENTITY(1,1) NOT NULL,<br/>
[ProjectID] [bigint] NULL,<br/>
CONSTRAINT [PrimaryKey1] PRIMARY KEY CLUSTERED ( [ID] ASC )<br/>
)<br/>
<br/>
CREATE TABLE [dbo].[DCDistributionBox]<br/>
(<br/>
[ID] [bigint] IDENTITY(1,1) NOT NULL,<br/>
[DataLoggerID] [bigint] NOT NULL,<br/>
CONSTRAINT [PrimaryKey2] PRIMARY KEY CLUSTERED ( [ID] ASC )<br/>
)<br/>
<br/>
ALTER TABLE [dbo].[DCDistributionBox]<br/>
ADD CONSTRAINT [FK_DCDistributionBox_DataLogger] <br/>
FOREIGN KEY([DataLoggerID]) REFERENCES [dbo].[DataLogger] ([ID])<br/>
<br/>
CREATE TABLE [dbo].[DCString] <br/>
(<br/>
[ID] [bigint] IDENTITY(1,1) NOT NULL,<br/>
[DCDistributionBoxID] [bigint] NOT NULL,<br/>
[CurrentMPP] [decimal](18, 2) NULL,<br/>
CONSTRAINT [PrimaryKey3] PRIMARY KEY CLUSTERED ( [ID] ASC )<br/>
)<br/>
<br/>
ALTER TABLE [dbo].[DCString]<br/>
ADD CONSTRAINT [FK_DCString_DCDistributionBox] <br/>
FOREIGN KEY([DCDistributionBoxID]) REFERENCES [dbo].[DCDistributionBox] ([ID])<br/>
<br/>
CREATE TABLE [dbo].[StringData]<br/>
(<br/>
[DCStringID] [bigint] NOT NULL,<br/>
[TimeStamp] [datetime] NOT NULL,<br/>
[DCCurrent] [decimal](18, 2) NULL,<br/>
CONSTRAINT [PrimaryKey4] PRIMARY KEY CLUSTERED ( [TimeStamp] DESC, [DCStringID] ASC)<br/>
)<br/>
<br/>
CREATE NONCLUSTERED INDEX [TimeStamp_DCCurrent-NonClusteredIndex] <br/>
ON [dbo].[StringData] ([DCStringID] ASC, [TimeStamp] ASC)<br/>
INCLUDE ([DCCurrent])<br/>
<br/>
```<br/>
<br/>
Standard indexes on the foreign keys also exist (I don't want to list them all for space reasons).<br/>
<br/>
The `[StringData]` table as has following storage stats:<br/>
<br/>
- Data space: 26,901.86 MB<br/>
<br/>
- Row count: 131,827,749<br/>
<br/>
- Partitioned: true<br/>
<br/>
- Partition count: 62<br/>
<br/>
**Usage**:<br/>
<br/>
I now want to group the data in the `[StringData]` table and do some aggregation.<br/>
<br/>
I created an Entity Framework query (detailed infos to the query can be found [here](https://stackoverflow.com/questions/35436442)):<br/>
<br/>
```<br/>
var compareData = model.StringDatas<br/>
.AsNoTracking()<br/>
.Where(p => p.DCString.DCDistributionBox.DataLogger.ProjectID == projectID && p.TimeStamp >= fromDate && p.TimeStamp < tillDate)<br/>
.Select(d => new<br/>
{<br/>
TimeStamp = d.TimeStamp,<br/>
DCCurrentMpp = d.DCCurrent / d.DCString.CurrentMPP<br/>
})<br/>
.GroupBy(d => DbFunctions.AddMinutes(DateTime.MinValue, DbFunctions.DiffMinutes(DateTime.MinValue, d.TimeStamp) / minuteInterval * minuteInterval))<br/>
.Select(d => new<br/>
{<br/>
TimeStamp = d.Key,<br/>
DCCurrentMppMin = d.Min(v => v.DCCurrentMpp),<br/>
DCCurrentMppMax = d.Max(v => v.DCCurrentMpp),<br/>
DCCurrentMppAvg = d.Average(v => v.DCCurrentMpp),<br/>
DCCurrentMppStDev = DbFunctions.StandardDeviationP(d.Select(v => v.DCCurrentMpp))<br/>
})<br/>
.ToList();<br/>
<br/>
```<br/>
<br/>
The excecution timespan is exceptional long!?<br/>
<br/>
- Execution result: 92rows<br/>
<br/>
- Execution time: ~16000ms<br/>
<br/>
**Attempts**:<br/>
<br/>
I now took a look into the Entity Framework generated SQL query and looks like this:<br/>
<br/>
```<br/>
DECLARE @p__linq__4 DATETIME = 0;<br/>
DECLARE @p__linq__3 DATETIME = 0;<br/>
DECLARE @p__linq__5 INT = 15;<br/>
DECLARE @p__linq__6 INT = 15;<br/>
DECLARE @p__linq__0 BIGINT = 20827;<br/>
DECLARE @p__linq__1 DATETIME = '06.02.2016 00:00:00';<br/>
DECLARE @p__linq__2 DATETIME = '07.02.2016 00:00:00';<br/>
<br/>
SELECT <br/>
1 AS [C1], <br/>
[GroupBy1].[K1] AS [C2], <br/>
[GroupBy1].[A1] AS [C3], <br/>
[GroupBy1].[A2] AS [C4], <br/>
[GroupBy1].[A3] AS [C5], <br/>
[GroupBy1].[A4] AS [C6]<br/>
FROM ( SELECT <br/>
[Project1].[K1] AS [K1], <br/>
MIN([Project1].[A1]) AS [A1], <br/>
MAX([Project1].[A2]) AS [A2], <br/>
AVG([Project1].[A3]) AS [A3], <br/>
STDEVP([Project1].[A4]) AS [A4]<br/>
FROM ( SELECT <br/>
DATEADD (minute, ((DATEDIFF (minute, @p__linq__4, [Project1].[TimeStamp])) / @p__linq__5) * @p__linq__6, @p__linq__3) AS [K1], <br/>
[Project1].[C1] AS [A1], <br/>
[Project1].[C1] AS [A2], <br/>
[Project1].[C1] AS [A3], <br/>
[Project1].[C1] AS [A4]<br/>
FROM ( SELECT <br/>
[Extent1].[TimeStamp] AS [TimeStamp], <br/>
[Extent1].[DCCurrent] / [Extent2].[CurrentMPP] AS [C1]<br/>
FROM [dbo].[StringData] AS [Extent1]<br/>
INNER JOIN [dbo].[DCString] AS [Extent2] ON [Extent1].[DCStringID] = [Extent2].[ID]<br/>
INNER JOIN [dbo].[DCDistributionBox] AS [Extent3] ON [Extent2].[DCDistributionBoxID] = [Extent3].[ID]<br/>
INNER JOIN [dbo].[DataLogger] AS [Extent4] ON [Extent3].[DataLoggerID] = [Extent4].[ID]<br/>
WHERE (([Extent4].[ProjectID] = @p__linq__0) OR (([Extent4].[ProjectID] IS NULL) AND (@p__linq__0 IS NULL))) AND ([Extent1].[TimeStamp] >= @p__linq__1) AND ([Extent1].[TimeStamp] < @p__linq__2)<br/>
) AS [Project1]<br/>
) AS [Project1]<br/>
GROUP BY [K1]<br/>
) AS [GroupBy1]<br/>
<br/>
```<br/>
<br/>
I copied this SQL query into SSMS on the same machine, connected with same connection string as the Entity Framework.<br/>
<br/>
The result is a very much improved performance:<br/>
<br/>
- Execution result: 92rows<br/>
<br/>
- Execution time: 517ms<br/>
<br/>
I also do some loop runing test and the result is strange. The test looks like this<br/>
<br/>
```<br/>
for (int i = 0; i < 50; i++)<br/>
{<br/>
DateTime begin = DateTime.UtcNow;<br/>
<br/>
[...query...]<br/>
<br/>
TimeSpan excecutionTimeSpan = DateTime.UtcNow - begin;<br/>
Debug.WriteLine("{0}th run: {1}", i, excecutionTimeSpan.ToString());<br/>
}<br/>
<br/>
```<br/>
<br/>
The result is very different and looks random(?):<br/>
<br/>
```<br/>
0th run: 00:00:11.0618580<br/>
1th run: 00:00:11.3339467<br/>
2th run: 00:00:10.0000676<br/>
3th run: 00:00:10.1508140<br/>
4th run: 00:00:09.2041939<br/>
5th run: 00:00:07.6710321<br/>
6th run: 00:00:10.3386312<br/>
7th run: 00:00:17.3422765<br/>
8th run: 00:00:13.8620557<br/>
9th run: 00:00:14.9041528<br/>
10th run: 00:00:12.7772906<br/>
11th run: 00:00:17.0170235<br/>
12th run: 00:00:14.7773750<br/>
<br/>
```<br/>
<br/>
**Question**:<br/>
<br/>
Why is Entity Framework query execution so slow? The resulting row count is really low and the raw SQL query shows a very fast performance.<br/>
<br/>
**Update 1**:<br/>
<br/>
I take care that its not a MetaContext or Model creation delay. Some other queries are executed on the same Model instance right before with good performance.<br/>
<br/>
**Update 2** (related to the answer of @x0007me):<br/>
<br/>
Thanks for the hint but this can be eliminated by changing the model settings like this:<br/>
<br/>
```<br/>
modelContext.Configuration.UseDatabaseNullSemantics = true;<br/>
<br/>
```<br/>
<br/>
The EF generated SQL is now:<br/>
<br/>
```<br/>
SELECT <br/>
1 AS [C1], <br/>
[GroupBy1].[K1] AS [C2], <br/>
[GroupBy1].[A1] AS [C3], <br/>
[GroupBy1].[A2] AS [C4], <br/>
[GroupBy1].[A3] AS [C5], <br/>
[GroupBy1].[A4] AS [C6]<br/>
FROM ( SELECT <br/>
[Project1].[K1] AS [K1], <br/>
MIN([Project1].[A1]) AS [A1], <br/>
MAX([Project1].[A2]) AS [A2], <br/>
AVG([Project1].[A3]) AS [A3], <br/>
STDEVP([Project1].[A4]) AS [A4]<br/>
FROM ( SELECT <br/>
DATEADD (minute, ((DATEDIFF (minute, @p__linq__4, [Project1].[TimeStamp])) / @p__linq__5) * @p__linq__6, @p__linq__3) AS [K1], <br/>
[Project1].[C1] AS [A1], <br/>
[Project1].[C1] AS [A2], <br/>
[Project1].[C1] AS [A3], <br/>
[Project1].[C1] AS [A4]<br/>
FROM ( SELECT <br/>
[Extent1].[TimeStamp] AS [TimeStamp], <br/>
[Extent1].[DCCurrent] / [Extent2].[CurrentMPP] AS [C1]<br/>
FROM [dbo].[StringData] AS [Extent1]<br/>
INNER JOIN [dbo].[DCString] AS [Extent2] ON [Extent1].[DCStringID] = [Extent2].[ID]<br/>
INNER JOIN [dbo].[DCDistributionBox] AS [Extent3] ON [Extent2].[DCDistributionBoxID] = [Extent3].[ID]<br/>
INNER JOIN [dbo].[DataLogger] AS [Extent4] ON [Extent3].[DataLoggerID] = [Extent4].[ID]<br/>
WHERE ([Extent4].[ProjectID] = @p__linq__0) AND ([Extent1].[TimeStamp] >= @p__linq__1) AND ([Extent1].[TimeStamp] < @p__linq__2)<br/>
) AS [Project1]<br/>
) AS [Project1]<br/>
GROUP BY [K1]<br/>
) AS [GroupBy1]<br/>
<br/>
```<br/>
<br/>
So you can see the problem you described is now solved, but the execution time does not change.<br/>
<br/>
Also, as you can see in the schema and the raw execution time, I used optimized structure with high optimized indexer.<br/>
<br/>
**Update 3** (related to the answer of @Vladimir Baranov):<br/>
<br/>
I don't see why this can be related to query plan caching. Because in the MSDN is clearly descripted that the EF6 make use of query plan caching.<br/>
<br/>
A simple test proof that the huge excecution time differenz is not related to the query plan caching (phseudo code):<br/>
<br/>
```<br/>
using(var modelContext = new ModelContext())<br/>
{<br/>
modelContext.Query(); //1th run activates caching<br/>
<br/>
modelContext.Query(); //2th used cached plan<br/>
}<br/>
<br/>
```<br/>
<br/>
As the result, both queries run with the same excecution time.<br/>
<br/>
**Update 4** (related to the answer of @bubi):<br/>
<br/>
I tried to run the query that is generated by the EF as you descripted it:<br/>
<br/>
```<br/>
int result = model.Database.ExecuteSqlCommand(@"SELECT <br/>
1 AS [C1], <br/>
[GroupBy1].[K1] AS [C2], <br/>
[GroupBy1].[A1] AS [C3], <br/>
[GroupBy1].[A2] AS [C4], <br/>
[GroupBy1].[A3] AS [C5], <br/>
[GroupBy1].[A4] AS [C6]<br/>
FROM ( SELECT <br/>
[Project1].[K1] AS [K1], <br/>
MIN([Project1].[A1]) AS [A1], <br/>
MAX([Project1].[A2]) AS [A2], <br/>
AVG([Project1].[A3]) AS [A3], <br/>
STDEVP([Project1].[A4]) AS [A4]<br/>
FROM ( SELECT <br/>
DATEADD (minute, ((DATEDIFF (minute, 0, [Project1].[TimeStamp])) / @p__linq__5) * @p__linq__6, 0) AS [K1], <br/>
[Project1].[C1] AS [A1], <br/>
[Project1].[C1] AS [A2], <br/>
[Project1].[C1] AS [A3], <br/>
[Project1].[C1] AS [A4]<br/>
FROM ( SELECT <br/>
[Extent1].[TimeStamp] AS [TimeStamp], <br/>
[Extent1].[DCCurrent] / [Extent2].[CurrentMPP] AS [C1]<br/>
FROM [dbo].[StringData] AS [Extent1]<br/>
INNER JOIN [dbo].[DCString] AS [Extent2] ON [Extent1].[DCStringID] = [Extent2].[ID]<br/>
INNER JOIN [dbo].[DCDistributionBox] AS [Extent3] ON [Extent2].[DCDistributionBoxID] = [Extent3].[ID]<br/>
INNER JOIN [dbo].[DataLogger] AS [Extent4] ON [Extent3].[DataLoggerID] = [Extent4].[ID]<br/>
WHERE ([Extent4].[ProjectID] = @p__linq__0) AND ([Extent1].[TimeStamp] >= @p__linq__1) AND ([Extent1].[TimeStamp] < @p__linq__2)<br/>
) AS [Project1]<br/>
) AS [Project1]<br/>
GROUP BY [K1]<br/>
) AS [GroupBy1]",<br/>
new SqlParameter("p__linq__0", 20827),<br/>
new SqlParameter("p__linq__1", fromDate),<br/>
new SqlParameter("p__linq__2", tillDate),<br/>
new SqlParameter("p__linq__5", 15),<br/>
new SqlParameter("p__linq__6", 15));<br/>
<br/>
```<br/>
<br/>
- Execution result: 92<br/>
<br/>
- Execution time: ~16000ms<br/>
<br/>
It took exact as long as the normal EF query!?<br/>
<br/>
**Update 5** (related to the answer of @vittore):<br/>
<br/>
I create a traced call tree, maybe it helps:<br/>
<br/>
**Update 6** (related to the answer of @usr):<br/>
<br/>
I created two showplan XML via SQL Server Profiler.<br/>
<br/>
[Fast run (SSMS).SQLPlan](http://pastebin.com/6We2u6Dg)<br/>
<br/>
[Slow run (EF).SQLPlan](http://pastebin.com/B2mty2R9)<br/>
<br/>
**Update 7** (related to the comments of @VladimirBaranov):<br/>
<br/>
I now run some more test case related to your comments.<br/>
<br/>
First I eleminate time taking order operations by using a new computed column and a matching INDEXER. This reduce the perfomance lag related to `DATEADD(MINUTE, DATEDIFF(MINUTE, 0, [TimeStamp] ) / 15* 15, 0)`. Detail for how and why you can find [here](https://stackoverflow.com/questions/35436442).<br/>
<br/>
The Result look s like this:<br/>
<br/>
Pure EntityFramework query:<br/>
<br/>
```<br/>
for (int i = 0; i < 3; i++)<br/>
{<br/>
DateTime begin = DateTime.UtcNow;<br/>
var result = model.StringDatas<br/>
.AsNoTracking()<br/>
.Where(p => p.DCString.DCDistributionBox.DataLogger.ProjectID == projectID && p.TimeStamp15Minutes >= fromDate && p.TimeStamp15Minutes < tillDate)<br/>
.Select(d => new<br/>
{<br/>
TimeStamp = d.TimeStamp15Minutes,<br/>
DCCurrentMpp = d.DCCurrent / d.DCString.CurrentMPP<br/>
})<br/>
.GroupBy(d => d.TimeStamp)<br/>
.Select(d => new<br/>
{<br/>
TimeStamp = d.Key,<br/>
DCCurrentMppMin = d.Min(v => v.DCCurrentMpp),<br/>
DCCurrentMppMax = d.Max(v => v.DCCurrentMpp),<br/>
DCCurrentMppAvg = d.Average(v => v.DCCurrentMpp),<br/>
DCCurrentMppStDev = DbFunctions.StandardDeviationP(d.Select(v => v.DCCurrentMpp))<br/>
})<br/>
.ToList();<br/>
<br/>
TimeSpan excecutionTimeSpan = DateTime.UtcNow - begin;<br/>
Debug.WriteLine("{0}th run pure EF: {1}", i, excecutionTimeSpan.ToString());<br/>
}<br/>
<br/>
```<br/>
<br/>
0th run pure EF: **00:00:12.6460624**<br/>
<br/>
1th run pure EF: **00:00:11.0258393**<br/>
<br/>
2th run pure EF: **00:00:08.4171044**<br/>
<br/>
I now used the EF generated SQL as a SQL query:<br/>
<br/>
```<br/>
for (int i = 0; i < 3; i++)<br/>
{<br/>
DateTime begin = DateTime.UtcNow;<br/>
int result = model.Database.ExecuteSqlCommand(@"SELECT <br/>
1 AS [C1], <br/>
[GroupBy1].[K1] AS [TimeStamp15Minutes], <br/>
[GroupBy1].[A1] AS [C2], <br/>
[GroupBy1].[A2] AS [C3], <br/>
[GroupBy1].[A3] AS [C4], <br/>
[GroupBy1].[A4] AS [C5]<br/>
FROM ( SELECT <br/>
[Project1].[TimeStamp15Minutes] AS [K1], <br/>
MIN([Project1].[C1]) AS [A1], <br/>
MAX([Project1].[C1]) AS [A2], <br/>
AVG([Project1].[C1]) AS [A3], <br/>
STDEVP([Project1].[C1]) AS [A4]<br/>
FROM ( SELECT <br/>
[Extent1].[TimeStamp15Minutes] AS [TimeStamp15Minutes], <br/>
[Extent1].[DCCurrent] / [Extent2].[CurrentMPP] AS [C1]<br/>
FROM [dbo].[StringData] AS [Extent1]<br/>
INNER JOIN [dbo].[DCString] AS [Extent2] ON [Extent1].[DCStringID] = [Extent2].[ID]<br/>
INNER JOIN [dbo].[DCDistributionBox] AS [Extent3] ON [Extent2].[DCDistributionBoxID] = [Extent3].[ID]<br/>
INNER JOIN [dbo].[DataLogger] AS [Extent4] ON [Extent3].[DataLoggerID] = [Extent4].[ID]<br/>
WHERE ([Extent4].[ProjectID] = @p__linq__0) AND ([Extent1].[TimeStamp15Minutes] >= @p__linq__1) AND ([Extent1].[TimeStamp15Minutes] < @p__linq__2)<br/>
) AS [Project1]<br/>
GROUP BY [Project1].[TimeStamp15Minutes]<br/>
) AS [GroupBy1];",<br/>
new SqlParameter("p__linq__0", 20827),<br/>
new SqlParameter("p__linq__1", fromDate),<br/>
new SqlParameter("p__linq__2", tillDate));<br/>
<br/>
TimeSpan excecutionTimeSpan = DateTime.UtcNow - begin;<br/>
Debug.WriteLine("{0}th run: {1}", i, excecutionTimeSpan.ToString());<br/>
}<br/>
<br/>
```<br/>
<br/>
0th run: **00:00:00.8381200**<br/>
<br/>
1th run: **00:00:00.6920736**<br/>
<br/>
2th run: **00:00:00.7081006**<br/>
<br/>
and with `OPTION(RECOMPILE)`:<br/>
<br/>
```<br/>
for (int i = 0; i < 3; i++)<br/>
{<br/>
DateTime begin = DateTime.UtcNow;<br/>
int result = model.Database.ExecuteSqlCommand(@"SELECT <br/>
1 AS [C1], <br/>
[GroupBy1].[K1] AS [TimeStamp15Minutes], <br/>
[GroupBy1].[A1] AS [C2], <br/>
[GroupBy1].[A2] AS [C3], <br/>
[GroupBy1].[A3] AS [C4], <br/>
[GroupBy1].[A4] AS [C5]<br/>
FROM ( SELECT <br/>
[Project1].[TimeStamp15Minutes] AS [K1], <br/>
MIN([Project1].[C1]) AS [A1], <br/>
MAX([Project1].[C1]) AS [A2], <br/>
AVG([Project1].[C1]) AS [A3], <br/>
STDEVP([Project1].[C1]) AS [A4]<br/>
FROM ( SELECT <br/>
[Extent1].[TimeStamp15Minutes] AS [TimeStamp15Minutes], <br/>
[Extent1].[DCCurrent] / [Extent2].[CurrentMPP] AS [C1]<br/>
FROM [dbo].[StringData] AS [Extent1]<br/>
INNER JOIN [dbo].[DCString] AS [Extent2] ON [Extent1].[DCStringID] = [Extent2].[ID]<br/>
INNER JOIN [dbo].[DCDistributionBox] AS [Extent3] ON [Extent2].[DCDistributionBoxID] = [Extent3].[ID]<br/>
INNER JOIN [dbo].[DataLogger] AS [Extent4] ON [Extent3].[DataLoggerID] = [Extent4].[ID]<br/>
WHERE ([Extent4].[ProjectID] = @p__linq__0) AND ([Extent1].[TimeStamp15Minutes] >= @p__linq__1) AND ([Extent1].[TimeStamp15Minutes] < @p__linq__2)<br/>
) AS [Project1]<br/>
GROUP BY [Project1].[TimeStamp15Minutes]<br/>
) AS [GroupBy1]<br/>
OPTION(RECOMPILE);",<br/>
new SqlParameter("p__linq__0", 20827),<br/>
new SqlParameter("p__linq__1", fromDate),<br/>
new SqlParameter("p__linq__2", tillDate));<br/>
<br/>
TimeSpan excecutionTimeSpan = DateTime.UtcNow - begin;<br/>
Debug.WriteLine("{0}th run: {1}", i, excecutionTimeSpan.ToString());<br/>
}<br/>
<br/>
```<br/>
<br/>
0th run with RECOMPILE: **00:00:00.8260932**<br/>
<br/>
1th run with RECOMPILE: **00:00:00.9139730**<br/>
<br/>
2th run with RECOMPILE: **00:00:01.0680665**<br/>
<br/>
Same SQL query excecuted in SSMS (without RECOMPILE):<br/>
<br/>
**00:00:01.105**<br/>
<br/>
Same SQL query excecuted in SSMS (with RECOMPILE):<br/>
<br/>
**00:00:00.902**<br/>
<br/>
I hope this are all values you needed.</t>