Cyrus Stoller home about consulting

Analyzing my email usage patterns

Recently, I was curious about who sends me the most emails and who I send the most emails to. In this post, I’m going to walk through how to find answers to these types of questions.

I know there have been companies that have offered similar analyses, but I’ve always been hesitant to give someone I don’t know authorization to read all of my email. By working through Apple Mail instead of through Gmail or Exchange directly, you can be 100% sure that I will never have access to your private emails.

Getting started

First thing you need to do is find the sqlite database that Mail uses to populate your user interface. I’d recommend making a copy, so you don’t accidentally corrupt it. The sqlite database can be found at:

~/Library/Mail/V2/MailData/Envelope\ Index

I copied the database file to a folder on my desktop and queried it from the command line.

$ mkdir -p ~/Desktop/mail_experiment
$ cp ~/Library/Mail/V2/MailData/Envelope\ Index \
  ~/Desktop/mail_experiment/mail_database.sqlite
$ cd ~/Desktop/mail_experiment
$ sqlite3 mail_database.sqlite

If you’re less comfortable using the command line, you may want to try using this open source Database Browser for SQLite.

Questions I am interested in

Who sends me the most emails?

select count(*), a.address, a.comment
from messages m 
inner join addresses a 
on m.sender = a.ROWID 
group by a.address
order by count(*) desc  
limit 10;

Who do I send the most emails to?

select count(*), a.address, a.comment
from messages m
inner join recipients r
on m.ROWID = r.message_id
inner join addresses a
on a.ROWID = r.address_id
where m.sender in
  (select a.ROWID from addresses a where address like '%cstoller%')
group by a.address
order by count(*) desc
limit 10;

Who receives the most emails that I also receive?

In other words, if I receive an email (either via to or cc), which people are most likely to also be in the to or cc fields.

select count(*), a.address, a.comment
from recipients r1
inner join recipients r2
on r1.message_id = r2.message_id AND r1.address_id != r2.address_id
inner join addresses a
on r2.address_id = a.ROWID
where r1.address_id in
  (select a.ROWID from addresses a where address like '%cstoller%')
group by a.address
order by count(*) desc
limit 10;

Who sends me the most attachments?

select count(*), a.address, a.comment
from messages m 
inner join attachments at
on at.message_id = m.ROWID
inner join addresses a 
on m.sender = a.ROWID
group by a.address
order by count(*) desc  
limit 10;

How many emails do I get by day of the week?

select count(*), case cast (strftime('%w', m.date_sent) as integer)
  when 0 then 'Sunday'
  when 1 then 'Monday'
  when 2 then 'Tuesday'
  when 3 then 'Wednesday'
  when 4 then 'Thursday'
  when 5 then 'Friday'
  else 'Saturday' end as weekday
from messages m
where m.sender NOT in 
  (select a.ROWID from addresses a where address like '%cstoller%')
group by weekday
order by strftime('%w', m.date_sent) asc;

I receive the most emails on Tuesdays and Wednesdays. I expected to see more on Mondays.

Which days do I send the most emails?

select count(*), case cast (strftime('%w', m.date_sent) as integer)
  when 0 then 'Sunday'
  when 1 then 'Monday'
  when 2 then 'Tuesday'
  when 3 then 'Wednesday'
  when 4 then 'Thursday'
  when 5 then 'Friday'
  else 'Saturday' end as weekday
from messages m
where m.sender in 
  (select a.ROWID from addresses a where address like '%cstoller%')
group by weekday
order by strftime('%w', m.date_sent) asc;

For my email I was surprised that I send fewer emails on Thursday than Saturday. On the other hand, I was less surprised that I send the most emails on Monday and Friday.

How many emails do I get by time of day?

select count(*), strftime("%H", m.date_sent, 'unixepoch', 'localtime') hour
from messages m
where m.sender not in 
  (select a.ROWID from addresses a where address like '%cstoller%')
group by hour
order by hour asc;

How many email addresses have come through my inbox? Either in to or cc fields.

select count(*), case r.type
  when 0 then 'To'
  when 1 then 'Cc'
  else 'NA'end as mtype
from recipients r
where r.address_id in 
  (select a.ROWID from addresses a where address like '%cstoller%')
group by mtype
order by r.type asc;

Is this useful?

Unclear. This was a curiosity that was pretty easy to itch.


I want to give credit to two authors who inspired this blog post - Sam Pullara and J. S. Floyd.

Category Tutorial