COUNT(*) versus COUNT(1) – the finale – Learning is not a spectator sport

COUNT(*) versus COUNT(1) – the finale

January 12, 2024

4–6 minutes

This is surely one of the most debated topics that has existed in SQL forums for the past 30 years., and you’re probably thinking “Surely this one has been done to death! Surely there is nothing more to day” which of course compels me to reference one of my all time favourite movies 😀

Man oh man did we burn some internet bandwidth and brain cycles on this topic. A quick google search comes up with a plethora of links, none of which are probably new to many of you 😀

https://www.linkedin.com/pulse/never-use-count-your-sql-count1-instead-myth-michael-papadopoulos

https://stackoverflow.com/questions/51460770/difference-between-count1-and-count-in-oracle

https://asktom.oracle.com/ords/asktom.search?tag=count-vs-count1-on-tables-with-clobs-and-etc

https://asktom.oracle.com/ords/asktom.search?tag=select-count1-how-it-works

https://asktom.oracle.com/ords/asktom.search?tag=difference-between-count1-and-count-200108

https://pretius.com/blog/difference-between-count-vs-count1-oracle/

https://www.linkedin.com/pulse/count-vs-count1-countcolumnname-sql-server-harsh-mehta

and many many more …To try paraphrase 30 years of debate into a few sentences, I think the timeline roughly went like this:

Early theories

“COUNT(*) must be slower because it looks at all columns” (which is wrong)
“COUNT(1) is faster because we don’t need to access columns” (which is wrong)

Later theories

“COUNT(primary key) is faster because its the only way to scan the index instead of the table” (which is wrong)

and the world finally came around the understanding that just about every database out there treats COUNT(*) as the same as COUNT(1) as the same as COUNT(primary key) as the same as COUNT(declaratively not null column) so it doesn’t really matter which you choose, and that we should stop arguing about it 😀

We also all settled on understanding that COUNT(nullable column) is fundamentally a different question to asking for a count of the rows in a result set

So we are done, right? We can finally put this whole issue to bed, right?

Time for another meme!

I’ve always been in camp “Use COUNT(*)” so the following demo is cut/pasted from my 19c instance with just a hint of smugness 😀. Lets create a materialized view that we want to be ON COMMIT FAST refresh but the definition also includes an aggregate.


SQL> create table emps
  2    ( empno primary key,
  3      deptno,
  4      ename,
  5      sal
  6    )   as
  7  select
  8     empno,
  9     deptno,
 10     ename,
 11     sal
 12  from scott.emp;

Table created.

SQL>
SQL> create materialized view log on emps
  2  with rowid, sequence, commit scn, primary key (sal, deptno)
  3  including new values;

Materialized view log created.

SQL>
SQL> create materialized view emps_mv
  2  refresh fast on commit as
  3  select deptno, max(sal) max_sal
  4  from emps
  5  group by deptno;

Materialized view created.

That looks good. No errors. But let us take a look at what the capabilities of the materialized view are:


SQL> exec dbms_mview.explain_mview('emps_mv')

PL/SQL procedure successfully completed.

SQL> select * from mv_capabilities_table
  2  @pr
==============================
STATEMENT_ID                  :
MVOWNER                       : MCDONAC
MVNAME                        : EMPS_MV
CAPABILITY_NAME               : REFRESH_COMPLETE
POSSIBLE                      : Y
RELATED_TEXT                  :
RELATED_NUM                   :
MSGNO                         :
MSGTXT                        :
SEQ                           : 1002
==============================
STATEMENT_ID                  :
MVOWNER                       : MCDONAC
MVNAME                        : EMPS_MV
CAPABILITY_NAME               : REFRESH_FAST
POSSIBLE                      : Y
RELATED_TEXT                  : SCN based refresh
RELATED_NUM                   :
MSGNO                         :
MSGTXT                        :
SEQ                           : 2003
==============================
STATEMENT_ID                  :
MVOWNER                       : MCDONAC
MVNAME                        : EMPS_MV
CAPABILITY_NAME               : REFRESH_FAST_AFTER_INSERT
POSSIBLE                      : Y
RELATED_TEXT                  :
RELATED_NUM                   :
MSGNO                         :
MSGTXT                        :
SEQ                           : 5006
==============================
STATEMENT_ID                  :
MVOWNER                       : MCDONAC
MVNAME                        : EMPS_MV
CAPABILITY_NAME               : REFRESH_FAST_AFTER_ONETAB_DML
POSSIBLE                      : N
RELATED_TEXT                  :
RELATED_NUM                   :
MSGNO                         : 2086
MSGTXT                        : materialized view uses MIN, MAX, ANY_VALUE, or BIT_*_AGG aggregate function
SEQ                           : 6007
==============================

You can see that we can do a fast refresh after inserting rows, but we can not do a fast refresh after general DML (update, delete, etc). If we peruse the documentation, you’ll see that in order to achieve this you need to have a COUNT aggregate in the view definition even if you do not need that data for your applications.

Let us assume that I am “Team COUNT(1)”, and recreate the materialized view.


SQL> drop materialized view emps_mv;

Materialized view dropped.

SQL> create materialized view emps_mv
  2  refresh fast on commit as
  3  select deptno, max(sal) max_sal, count(1) c
  4  from emps
  5  group by deptno;

Materialized view created.

SQL> exec dbms_mview.explain_mview('emps_mv')

PL/SQL procedure successfully completed.

SQL> select * from mv_capabilities_table
  2  @pr
==============================
STATEMENT_ID                  :
MVOWNER                       : MCDONAC
MVNAME                        : EMPS_MV
CAPABILITY_NAME               : REFRESH_COMPLETE
POSSIBLE                      : Y
RELATED_TEXT                  :
RELATED_NUM                   :
MSGNO                         :
MSGTXT                        :
SEQ                           : 1002
==============================
STATEMENT_ID                  :
MVOWNER                       : MCDONAC
MVNAME                        : EMPS_MV
CAPABILITY_NAME               : REFRESH_FAST
POSSIBLE                      : Y
RELATED_TEXT                  : SCN based refresh
RELATED_NUM                   :
MSGNO                         :
MSGTXT                        :
SEQ                           : 2003
==============================
STATEMENT_ID                  :
MVOWNER                       : MCDONAC
MVNAME                        : EMPS_MV
CAPABILITY_NAME               : REFRESH_FAST_AFTER_INSERT
POSSIBLE                      : Y
RELATED_TEXT                  :
RELATED_NUM                   :
MSGNO                         :
MSGTXT                        :
SEQ                           : 5006
==============================
STATEMENT_ID                  :
MVOWNER                       : MCDONAC
MVNAME                        : EMPS_MV
CAPABILITY_NAME               : REFRESH_FAST_AFTER_ONETAB_DML
POSSIBLE                      : N
RELATED_TEXT                  :
RELATED_NUM                   :
MSGNO                         : 2086
MSGTXT                        : materialized view uses MIN, MAX, ANY_VALUE, or BIT_*_AGG aggregate function
SEQ                           : 6007
==============================

Still no luck. It seems like that we cannot do general DML on such a view.

But what if I am “Team COUNT(*)”? Let us repeat the exercise.



SQL> drop materialized view emps_mv;

Materialized view dropped.

SQL> create materialized view emps_mv
  2  refresh fast on commit as
  3  select deptno, max(sal) max_sal, count(*) c
  4  from emps
  5  group by deptno;

Materialized view created.

SQL> exec dbms_mview.explain_mview('emps_mv')

PL/SQL procedure successfully completed.

SQL> select * from mv_capabilities_table
  2  @pr
==============================
STATEMENT_ID                  :
MVOWNER                       : MCDONAC
MVNAME                        : EMPS_MV
CAPABILITY_NAME               : REFRESH_COMPLETE
POSSIBLE                      : Y
RELATED_TEXT                  :
RELATED_NUM                   :
MSGNO                         :
MSGTXT                        :
SEQ                           : 1002
==============================
STATEMENT_ID                  :
MVOWNER                       : MCDONAC
MVNAME                        : EMPS_MV
CAPABILITY_NAME               : REFRESH_FAST
POSSIBLE                      : Y
RELATED_TEXT                  : SCN based refresh
RELATED_NUM                   :
MSGNO                         :
MSGTXT                        :
SEQ                           : 2003
==============================
STATEMENT_ID                  :
MVOWNER                       : MCDONAC
MVNAME                        : EMPS_MV
CAPABILITY_NAME               : REFRESH_FAST_AFTER_INSERT
POSSIBLE                      : Y
RELATED_TEXT                  :
RELATED_NUM                   :
MSGNO                         :
MSGTXT                        :
SEQ                           : 5006
==============================
STATEMENT_ID                  :
MVOWNER                       : MCDONAC
MVNAME                        : EMPS_MV
CAPABILITY_NAME               : REFRESH_FAST_AFTER_ANY_DML
POSSIBLE                      : Y
RELATED_TEXT                  :
RELATED_NUM                   :
MSGNO                         :
MSGTXT                        :
SEQ                           : 7007
==============================

And there you have it. Absolutely conclusive proof that COUNT(*) is the only phrase you should ever be using in all of your SQL code 😀

PS: Hopefully you can see that this post is a little tongue-in-cheek, but the serious part of the message is, always explore, investigate and test the features you’re using in the Oracle database. This post was motivated by a great post on materialized views by Martin. You should go check it out.

8 responses to “COUNT(*) versus COUNT(1) – the finale”

JAYT22

January 12, 2024 at 4:36 pm

Connor, no doubt you’ll correct me, but isn’t this just an isolated case in the code that assesses MV capabilities? Isn’t it equally possible that in some other relatively obscure edge case the database is looking for COUNT(1)?

The hunt begins…

(BTW, I’d probably always use COUNT(*) anyway, just playing devil’s advocate.)

Reply
1. Connor McDonald
  
  January 12, 2024 at 8:17 pm
  
  “… isn’t this just an isolated case in the code that assesses…”
  
  Hopefully you made it to my PS. This is all a little tongue-in-cheek
  
  Reply
2. Shashank Barki
  
  January 13, 2024 at 3:00 am
  
  Excellent. Thank you for sharing.
  
  I use count(1) also because it’s easy typing in keyboard to avoid more shift usage 😁
  
  Reply
elkjetpack1

January 13, 2024 at 6:02 am

BUT! If you’re not querying MViews, I don’t see anything wrong with using COUNT(1), which I’ve used in Oracle scripts for over 35 years! It’s never failed me yet.

Reply
Yuefu Li

January 15, 2024 at 3:12 pm

Excellent. Thanks

Reply
Joerg Sobottka

January 17, 2024 at 2:55 am

Is this also true for approx_count(1)/approx_count(*)? I have to test it. Or do you? 🙂

Reply
Mark Lancaster

February 10, 2024 at 7:50 am

Also worth mentioning count(1) is transformed into count(*) at least up to 11.2, as noted by Jonathan Lewis https://jonathanlewis.wordpress.com/2015/01/06/count-4/.

Would be interesting to see if this is still the case.

Reply
1. Connor McDonald
  
  February 12, 2024 at 12:05 am
  
  It is, but not in this particular instance
  
  Reply