Earnings Calendar Issues

Hi,

I am using the earnings calendar and I chose a stock such as AAPL, but the data shows 5 earnings dates in 2018: 1/3/2018, 4/3/2018, 7/3/2018, 10/3/2018, 11/02/2018. Why is the data messed up? The extra earnings date is not ideal, as I now have to filter the data.

Any explanation would be a big help.

Thanks

1 response

First, I want to verify that you are using the EventVestor EarningsCalendar dataset. That seems to be the case.

So, there are four typical fields used in this dataset:

next_asof_date - datetime64[ns]
previous_asof_date - datetime64[ns]
next_announcement - datetime64[ns]
previous_announcement - datetime64[ns]



There are really just two pieces of data: previous_announcement and next_announcement. Both are dates, and each has an associated asof_date. So, for example, every time a new earnings announcement date is posted, it gets a new asof_date, which is the date it was posted. Below are the four fields for AAPL during 2018.

PIPELINE DATE               next_ann    next_asof_date  prev_ann    prev_asof_date
2018-01-02 00:00:00+00:00   NaT         NaT             2017-11-02  2017-10-04
2018-01-04 00:00:00+00:00   2018-02-01  2018-01-03      2017-11-02  2017-10-04
2018-02-01 00:00:00+00:00   2018-02-01  2018-01-03      2018-02-01  2018-01-03
2018-02-02 00:00:00+00:00   NaT         NaT             2018-02-01  2018-01-03
2018-04-04 00:00:00+00:00   2018-05-01  2018-04-03      2018-02-01  2018-01-03
2018-05-01 00:00:00+00:00   2018-05-01  2018-04-03      2018-05-01  2018-04-03
2018-05-02 00:00:00+00:00   NaT         NaT             2018-05-01  2018-04-03
2018-07-05 00:00:00+00:00   2018-07-31  2018-07-03      2018-05-01  2018-04-03
2018-07-31 00:00:00+00:00   2018-07-31  2018-07-03      2018-07-31  2018-07-03
2018-08-01 00:00:00+00:00   NaT         NaT             2018-07-31  2018-07-03
2018-10-04 00:00:00+00:00   2018-11-01  2018-10-03      2018-07-31  2018-07-03
2018-11-01 00:00:00+00:00   2018-11-01  2018-10-03      2018-11-01  2018-10-03
2018-11-02 00:00:00+00:00   NaT         NaT             2018-11-01  2018-10-03
2018-11-06 00:00:00+00:00   2019-01-31  2018-11-02      2018-11-01  2018-10-03



Since you are seeing the five dates 1/3/2018, 4/3/2018, 7/3/2018, 10/3/2018, 11/02/2018, you must be looking at the next_asof_date field. This is probably not what you want. I would think you want the actual announcement dates (not the dates on which the company said when it would make the announcement). In any case, the issue and the fix would be the same.

One issue is that a company can say it is going to release earnings on one date, then change its mind and release them on a different date. This can happen several times a year and will result in more than four dates a year. Additionally, there will typically be five or more next_announcement dates posted in a single year: four for the current year and one for the following year. That's the situation with AAPL. On 11/02/2018 they stated they would announce their next earnings on 01/31/2019. That added a fifth posting of an earnings announcement to 2018.
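To make the counting concrete, here is a small sketch that rebuilds the AAPL table above as a plain pandas DataFrame and counts the distinct dates per field that fall in 2018. It runs outside the pipeline; the names table and distinct_dates_in_year are made up for this illustration and are not part of the dataset.

```python
import pandas as pd

# Rows copied from the AAPL table above; None stands in for NaT.
columns = ["pipeline_date", "next_announcement", "next_asof_date",
           "previous_announcement", "previous_asof_date"]
rows = [
    ("2018-01-02", None, None, "2017-11-02", "2017-10-04"),
    ("2018-01-04", "2018-02-01", "2018-01-03", "2017-11-02", "2017-10-04"),
    ("2018-02-01", "2018-02-01", "2018-01-03", "2018-02-01", "2018-01-03"),
    ("2018-02-02", None, None, "2018-02-01", "2018-01-03"),
    ("2018-04-04", "2018-05-01", "2018-04-03", "2018-02-01", "2018-01-03"),
    ("2018-05-01", "2018-05-01", "2018-04-03", "2018-05-01", "2018-04-03"),
    ("2018-05-02", None, None, "2018-05-01", "2018-04-03"),
    ("2018-07-05", "2018-07-31", "2018-07-03", "2018-05-01", "2018-04-03"),
    ("2018-07-31", "2018-07-31", "2018-07-03", "2018-07-31", "2018-07-03"),
    ("2018-08-01", None, None, "2018-07-31", "2018-07-03"),
    ("2018-10-04", "2018-11-01", "2018-10-03", "2018-07-31", "2018-07-03"),
    ("2018-11-01", "2018-11-01", "2018-10-03", "2018-11-01", "2018-10-03"),
    ("2018-11-02", None, None, "2018-11-01", "2018-10-03"),
    ("2018-11-06", "2019-01-31", "2018-11-02", "2018-11-01", "2018-10-03"),
]
table = pd.DataFrame(rows, columns=columns).apply(pd.to_datetime)


def distinct_dates_in_year(series, year):
    """Return the sorted distinct non-null dates in `series` that fall in `year`."""
    dates = series.dropna().drop_duplicates()
    return sorted(dates[dates.dt.year == year])


# Five distinct posting dates in 2018 (the dates from the question).
print(len(distinct_dates_in_year(table["next_asof_date"], 2018)))
# Four actual announcement dates in 2018.
print(len(distinct_dates_in_year(table["previous_announcement"], 2018)))
```

The fifth next_asof_date (2018-11-02) belongs to the announcement scheduled for 2019-01-31, which is why counting on that field gives five rather than four.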

So, what to do? If you are looking for the most recent 4 earnings dates, then something like this:

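# Copy the security level (level 1 of the index) into a regular column.
pipe_output['stock'] = pipe_output.index.get_level_values(1)
# Keep only the last row for each (stock, previous_announcement) pair.
last_announcements = pipe_output.drop_duplicates(['stock', 'previous_announcement'], keep='last')
# Take the 4 most recent announcement dates for each security.
last_4_announcements = last_announcements.groupby('stock').previous_announcement.nlargest(4)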



What this does is first add a new column to the data frame which is a copy of the security level of the index. This just makes it easier to use the drop_duplicates method. Then it applies drop_duplicates to keep only the last row for each combination of security and previous_announcement date. Finally, it groups by security and takes the 4 largest previous_announcement dates. Those will be the last four dates on which each company actually announced earnings.

Hope that helps. See the attached notebook.
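For anyone who wants to try the idea without running a pipeline, here is a minimal self-contained sketch of the same steps on a hand-built frame. It assumes the pipeline output is indexed by (date, security) with the security in level 1. The helper names last_n_announcements and make_frame are made up for this example, the AAPL dates come from the table above, and the XYZ security and its dates are invented.

```python
import pandas as pd


def last_n_announcements(pipe_output, n=4):
    """Return the n most recent distinct previous_announcement dates per security.

    Assumes a (date, security) MultiIndex with the security in level 1.
    """
    frame = pipe_output.copy()
    # Copy the security level into a column so drop_duplicates can use it.
    frame["stock"] = frame.index.get_level_values(1)
    frame = frame.dropna(subset=["previous_announcement"])
    # Keep the last row for each (stock, previous_announcement) pair.
    deduped = frame.drop_duplicates(["stock", "previous_announcement"], keep="last")
    return deduped.groupby("stock")["previous_announcement"].nlargest(n)


pipeline_dates = pd.to_datetime([
    "2018-01-02", "2018-02-02", "2018-05-02",
    "2018-08-01", "2018-11-02", "2018-11-06",
])


def make_frame(symbol, previous_announcements):
    """Build a toy pipeline-style frame for one security."""
    index = pd.MultiIndex.from_product([pipeline_dates, [symbol]])
    values = pd.to_datetime(previous_announcements).values
    return pd.DataFrame({"previous_announcement": values}, index=index)


pipe_output = pd.concat([
    # AAPL values taken from the table above.
    make_frame("AAPL", ["2017-11-02", "2018-02-01", "2018-05-01",
                        "2018-07-31", "2018-11-01", "2018-11-01"]),
    # XYZ is a hypothetical security with invented dates.
    make_frame("XYZ", ["2017-10-20", "2018-01-25", "2018-04-26",
                       "2018-04-26", "2018-07-26", "2018-10-25"]),
]).sort_index()

result = last_n_announcements(pipe_output, n=4)
print(result)
```

Each security ends up with exactly four dates even though the input holds five distinct previous_announcement values per security, because nlargest drops the oldest one.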
