Is ContentQuery Join NOT a SQL Join? Poor Performance on Large Datasets

Topics: Core, Troubleshooting
Apr 2, 2014 at 7:04 PM
I have 150,000+ titles (i.e. TitleParts) inside of my database. When I go to create a new Projection it is failing (as in 20 second delay) at the following location:

Orchard.Projections.Drivers.ProjectionPartDriver - Editor Method - following code:
model.QueryRecordEntries = Services.ContentManager.Query<QueryPart>().Join<TitlePartRecord>().OrderBy(x => x.Title).List()
    .Select(x => new QueryRecordEntry {
        Id = x.Id,
        Name = x.Name,
        LayoutRecordEntries = x.Layouts.Select(l => new LayoutRecordEntry {
            Id = l.Id,
            Description = GetLayoutDescription(layouts, l)
        })
    });
I only have 3 queries so this should be very fast (if this is using a SQL INNER JOIN. However, if you use a SQL trace, you will observe the following SQL is run:
SELECT this_.Id as Id6942_3_
    , this_.Number as Number6942_3_
    , this_.Published as Published6942_3_
    , this_.Latest as Latest6942_3_
    , this_.Data as Data6942_3_
    , this_.ContentItemRecord_id as ContentI6_6942_3_
    , titlepartr1_.Id as Id6989_0_
    , titlepartr1_.Title as Title6989_0_
    , titlepartr1_.ContentItemRecord_id as ContentI3_6989_0_
    , contentite4_.Id as Id6941_1_
    , contentite4_.Data as Data6941_1_
    , contentite4_.ContentType_id as ContentT3_6941_1_
    , contenttyp5_.Id as Id6943_2_
    , contenttyp5_.Name as Name6943_2_ 
FROM OrchardPPC_Orchard_Framework_ContentItemVersionRecord this_ inner join 
    OrchardPPC_Title_TitlePartRecord titlepartr1_ on this_.Id=titlepartr1_.Id left outer join 
    OrchardPPC_Orchard_Framework_ContentItemRecord contentite4_ on this_.ContentItemRecord_id=contentite4_.Id left outer join 
    OrchardPPC_Orchard_Framework_ContentTypeRecord contenttyp5_ on contentite4_.ContentType_id=contenttyp5_.Id 
WHERE this_.Published = 1 
ORDER BY titlepartr1_.Title asc
Unless I am looking at my SQL Trace wrong, this means all 150,000+ rows of data are being returned to look at three titles. So, my observations are that the ContentQuery Join here:
Services.ContentManager.Query<QueryPart>().Join<TitlePartRecord>()
Is not making it to the database as a SQL join. Am I looking at this wrong? If not, is there a reason .Join cannot actually result in a SQL join? That is probably more of an NHibernate question.
Apr 2, 2014 at 10:31 PM
I wanted to better understand how the queries worked so I went ahead and removed the join so I just have this:
var queryParts = Services.ContentManager.Query<QueryPart>().List();
My assumption was that it would "magically" query the database for just those content items containing the QueryPart. Actually, it seems to me that in the past we had a ContentType definition that showed the associated parts. I assumed it would query on what contenttypes the part exists then query just those content types. However, it looks like that is now (or always was) Infoset so no direct SQL query would be possible.

So, the above call drives this query:
SELECT this_.Id as Id1749_2_, this_.Number as Number1749_2_, this_.Published as Published1749_2_, this_.Latest as Latest1749_2_, this_.Data as Data1749_2_, this_.ContentItemRecord_id as ContentI6_1749_2_, contentite2_.Id as Id1734_0_, contentite2_.Data as Data1734_0_, contentite2_.ContentType_id as ContentT3_1734_0_, contenttyp3_.Id as Id1761_1_, contenttyp3_.Name as Name1761_1_ 
FROM OrchardPPC_Orchard_Framework_ContentItemVersionRecord this_ left outer join 
    OrchardPPC_Orchard_Framework_ContentItemRecord contentite2_ on this_.ContentItemRecord_id=contentite2_.Id left outer join 
    OrchardPPC_Orchard_Framework_ContentTypeRecord contenttyp3_ on contentite2_.ContentType_id=contenttyp3_.Id 
WHERE this_.Published = 1
Which means that even without the join we are getting back all 150,000+ records. I am guessing at that point it iterates through all of these records to find those with the content part association (it is just a guess, haven't tracked the code enough).

So here is my concern, if the above is correct then the query is going to take longer and longer with more records. Maybe it really isn't a concern and something else is going on with my particular database, but it seems concerning. Thoughts welcome.
Coordinator
Apr 2, 2014 at 10:57 PM
What should happen is that if the part is a ContentPartRecord then it would join on a Record, thus limiting the results. But you point to an interesting fact, that using InfoSet we might have removed some of this information ... and then is that we are loading everything ? Crazy, I will look into it, as in this case we could filter the content types as you are mentioning.
Apr 2, 2014 at 11:15 PM
Thanks for looking into this Sebastien, as further confirmation that it isn't just my database (for my own sanity more than anything) I tried to create a new projection on another large database that has 279,000+ records and it takes minutes. On the live server it just times out (and bogs the system down of course).
Coordinator
Apr 3, 2014 at 12:16 AM
Edited Apr 3, 2014 at 12:17 AM
I start to understand what is happening. Try to add more load myself. Could you add an index on the Title field and retry ?
(this just applied to your first case, when creating a new query)
Apr 3, 2014 at 2:30 AM
That would help on the "Sorting" aspects of the query though I agree it wouldn't prevent it from first returning all of the records. I checked to see if I could index the "Title" column and the nvarchar(1024) is a bit too long (so guessing it would truncate it). Still took the index. I then did a full build of the index using:
UPDATE STATISTICS [dbo].[OrchardPPC_Title_TitlePartRecord] 
    WITH FULLSCAN
It is fascinating, you took my first query from around 30 seconds to around 2 seconds (tremendous improvement). However, the N+1 after that (on the 150,000+ record database) is crazy. In my profiler I have this:
exec sp_executesql N'SELECT contentite0_.Id as Id4455_0_, contentite0_.Data as Data4455_0_, contentite0_.ContentType_id as ContentT3_4455_0_ FROM OrchardPPC_Orchard_Framework_ContentItemRecord contentite0_ WHERE contentite0_.Id=@p0',N'@p0 int',@p0=9571
go
And it just keeps doing that for all the Id's (as in around 30,000 hits with that single query). As you might suspect, this one is still taking around a minute or more.

On the database with 279,000+ records it only takes a second for the FULL loading of the new projection page (so on that one, for some reason, the SQL server is not using iteration, not yet sure why). On this database, it is now working perfectly (that or there is some SQL server caching going on to make it run crazy fast - always possible, I will explore more).

Look forward to your thoughts.
Apr 3, 2014 at 2:51 AM
I figured out why my 279,000+ one is working so well, it only has 58 titles in the whole database! So, when I use the query an inner join is getting passed in on the title which means we are only dealing with 58 rows.

On the 150,000+ database there are 154,710 titles so we are getting all of those rows back when a new Projection is attempted to be made...
Coordinator
Apr 3, 2014 at 5:32 PM
I agree there are two different issues here. I have a script which can generate as many Pages as I want, this will create versions and titles also so this should be ok to see where the issue is ... will keep you posted as I progress on it.
Apr 3, 2014 at 8:24 PM
Edited Apr 3, 2014 at 8:25 PM
It just hit me that if we do this - note the extra .Join<QueryPartRecord>() :
model.QueryRecordEntries = Services.ContentManager.Query<QueryPart>().Join<QueryPartRecord>().Join<TitlePartRecord>().OrderBy(x => x.Title).List()
    .Select(x => new QueryRecordEntry {
        Id = x.Id,
        Name = x.Name,
        LayoutRecordEntries = x.Layouts.Select(l => new LayoutRecordEntry {
            Id = l.Id,
            Description = GetLayoutDescription(layouts, l)
        })
    });
Which it appears is the same as doing this:
model.QueryRecordEntries = Services.ContentManager.Query<QueryPart, QueryPartRecord>().Join<TitlePartRecord>().
Then the page loads in about a second. The extra join forces it to constrict itself (i.e. join in SQL) to the QueryPartRecord table (which only has a couple records). I don't know that this is a perfect solution (sometimes ContentParts don't have ContentPartRecords). Thought I would at least throw it out there for further consideration.
Coordinator
Apr 3, 2014 at 8:37 PM
Actions:
  • Use the record in this specific query
  • When no record is defined, use content type definitions to limit to types with this part
I have 250K items right now and the limiting factor is the order on ModifiedUtc, without ordering it's very fast, with it's 6 seconds to get a page of items
Marked as answer by jao28 on 4/3/2014 at 12:51 PM
Apr 3, 2014 at 8:49 PM
I think my Title is indexed so it is even faster than 6 seconds for me (though I don't really believe that indexing a title makes a lot of sense overall, good to know it is an option when really need to go custom to optimize performance).

I agree with your actions. Do you think this:
model.QueryRecordEntries = Services.ContentManager.Query<QueryPart, QueryPartRecord>().Join<TitlePartRecord>().
Could that make it's way into core? I don't see any negative for users who have few titles and there are a lot of advantages for those who do. The second bullet you present will have to be done on a case by case basis (unless you can think of a way to script it lower to automatically look up the types - that could get ugly though).

On a side note, I am getting a LOT of latency on SQL Azure compared to local SQL database. Even the execution plans are different (though the indexes are all built and the data is the same). Queries that take about a second locally coming in at 8 seconds on Azure. The above change will at least mean that creating a Projection doesn't crash though I am still seeking any further optimization for Azure - thoughts welcome or we can also drop this thread and move on since this is a bit off topic.
Coordinator
Apr 4, 2014 at 4:36 AM
I fixed the issue I think c.f. https://orchard.codeplex.com/discussions/535697

Can you try with the 1.x branch ? I have 1M content items (not records, real content items, Pages) and it's as fast as with few items.
Apr 4, 2014 at 3:35 PM
I will be able to pull in 1.x starting next week. I have 2 questions though to make sure I am testing correctly:
  1. Is 1.x still the path to follow for the most recent development? I am on it now but did not know if I had to jump to 1.8.x to get the latest
  2. I see no actual changes to the ProjectionPart Driver in your changeset: https://orchard.codeplex.com/SourceControl/changeset/dad8835196303d6bebcba9d4e40c34b96233c5cd Can I assume that you were able to fix it generically so I should undo my code change above which dragged in the QueryPartRecord?
Will let you know how it turns out.
Coordinator
Apr 4, 2014 at 5:39 PM
If you have been using 1.x for a long time you will need to apply a specific patch before updating to 1.8.
And yes, I'd like you to test without your local patch to see if it 's fixing this and the inner joins are not needed.
Apr 14, 2014 at 8:25 PM
Finally was able to update to 1.x. With the changes you made I no longer needed my local patch. Thanks much!
Coordinator
Apr 14, 2014 at 8:53 PM
Awesome, thanks for getting back.
May 8, 2014 at 10:09 PM
I hate to even rediscuss this topic, but any chance something changed in the source code? Possibly I was just wrong before and things were cached, but today the page above is back to taking 60+ seconds. I believe the major changes you made for the improvements were in "Orchard.Framework.ContentManagement.DefaultContentQuery.cs" and my file looks like 1.x version so I am not sure what is going on. Any thoughts? I don't believe any code reverted. Happy to jump on a Skype meeting if you want to see it. In the meantime, putting my join back in.
May 8, 2014 at 10:20 PM
For anyone keeping score at home, this (is the slow one):
model.QueryRecordEntries = Services.ContentManager.Query<QueryPart, QueryPartRecord>().Join<TitlePartRecord>()...
Is not the same performance as this (is the fast one):
model.QueryRecordEntries = Services.ContentManager.Query<QueryPart>().Join<QueryPartRecord>().Join<TitlePartRecord>()...
I mentioned above that I thought they were the same but not the case in practice. The second one is the only fast / join way while the first one doesn't seem to make any difference and is still really slow. Feedback welcome.
Aug 28, 2014 at 7:22 PM
Final update, as of Orchard 1.8.1 this Query method is very fast:
model.QueryRecordEntries = Services.ContentManager.Query<QueryPart, QueryPartRecord>().Join<TitlePartRecord>()...
However, NOT having the record is still extremely slow. I have put in a ticket here: https://orchard.codeplex.com/workitem/20909