Recently, I was curious about who sends me the most emails and who I send the most emails to. In this post, I’m going to walk through how to find answers to these types of questions.
I know there have been companies that have offered similar analyses, but I’ve always been hesitant to give someone I don’t know authorization to read all of my email. By working through Apple Mail instead of through Gmail or Exchange directly, you can be 100% sure that I will never have access to your private emails.
First thing you need to do is find the sqlite database that Mail uses to populate your user interface. I’d recommend making a copy, so you don’t accidentally corrupt it. The sqlite database can be found at:
~/Library/Mail/V2/MailData/Envelope\ Index
I copied the database file to a folder on my desktop and queried it from the command line.
$ mkdir -p ~/Desktop/mail_experiment
$ cp ~/Library/Mail/V2/MailData/Envelope\ Index \
~/Desktop/mail_experiment/mail_database.sqlite
$ cd ~/Desktop/mail_experiment
$ sqlite3 mail_database.sqlite
If you’re less comfortable using the command line, you may want to try using this open source Database Browser for SQLite.
select count(*), a.address, a.comment
from messages m
inner join addresses a
on m.sender = a.ROWID
group by a.address
order by count(*) desc
limit 10;
select count(*), a.address, a.comment
from messages m
inner join recipients r
on m.ROWID = r.message_id
inner join addresses a
on a.ROWID = r.address_id
where m.sender in
(select a.ROWID from addresses a where address like '%cstoller%')
group by a.address
order by count(*) desc
limit 10;
In other words, if I receive an email (either via to
or cc
), which people are most likely to also
be in the to
or cc
fields.
select count(*), a.address, a.comment
from recipients r1
inner join recipients r2
on r1.message_id = r2.message_id AND r1.address_id != r2.address_id
inner join addresses a
on r2.address_id = a.ROWID
where r1.address_id in
(select a.ROWID from addresses a where address like '%cstoller%')
group by a.address
order by count(*) desc
limit 10;
select count(*), a.address, a.comment
from messages m
inner join attachments at
on at.message_id = m.ROWID
inner join addresses a
on m.sender = a.ROWID
group by a.address
order by count(*) desc
limit 10;
select count(*), case cast (strftime('%w', m.date_sent) as integer)
when 0 then 'Sunday'
when 1 then 'Monday'
when 2 then 'Tuesday'
when 3 then 'Wednesday'
when 4 then 'Thursday'
when 5 then 'Friday'
else 'Saturday' end as weekday
from messages m
where m.sender NOT in
(select a.ROWID from addresses a where address like '%cstoller%')
group by weekday
order by strftime('%w', m.date_sent) asc;
I receive the most emails on Tuesdays and Wednesdays. I expected to see more on Mondays.
select count(*), case cast (strftime('%w', m.date_sent) as integer)
when 0 then 'Sunday'
when 1 then 'Monday'
when 2 then 'Tuesday'
when 3 then 'Wednesday'
when 4 then 'Thursday'
when 5 then 'Friday'
else 'Saturday' end as weekday
from messages m
where m.sender in
(select a.ROWID from addresses a where address like '%cstoller%')
group by weekday
order by strftime('%w', m.date_sent) asc;
For my email I was surprised that I send fewer emails on Thursday than Saturday. On the other hand, I was less surprised that I send the most emails on Monday and Friday.
select count(*), strftime("%H", m.date_sent, 'unixepoch', 'localtime') hour
from messages m
where m.sender not in
(select a.ROWID from addresses a where address like '%cstoller%')
group by hour
order by hour asc;
to
or cc
fields.select count(*), case r.type
when 0 then 'To'
when 1 then 'Cc'
else 'NA'end as mtype
from recipients r
where r.address_id in
(select a.ROWID from addresses a where address like '%cstoller%')
group by mtype
order by r.type asc;
Unclear. This was a curiosity that was pretty easy to itch.
I want to give credit to two authors who inspired this blog post - Sam Pullara and J. S. Floyd.