Function Repository Resource:

FiveThirtyEightPresidentialPollingAverages

Import polling data on presidential elections from the website FiveThirtyEight

Contributed by: Bob Sandheinrich

ResourceFunction["FiveThirtyEightPresidentialPollingAverages"][yr]

imports presidential polling data from fivethirtyeight.com for the specified election year yr.

ResourceFunction["FiveThirtyEightPresidentialPollingAverages"][{yr1,yr2,…}]

combines data from several election years.

Details and Options

The result is a Dataset containing polling averages over time for several US presidential elections.
The "State" field can be a state, congressional district, the District of Columbia or the whole United States.
Each row in the dataset contains the support for one candidate in one area at one moment in time. The support is computed by fivethirtyeight.com by combining polls using their model.
The data is imported from FiveThirtyEight's GitHub repository on each evaluation. This requires an internet connection and takes several seconds. The data is not updated frequently, so it is recommended to store the data locally for repeated usage.
The year can be given as an integer or DateObject.
ResourceFunction["FiveThirtyEightPresidentialPollingAverages"][All] combines all years in a single Dataset.
ResourceFunction["FiveThirtyEightPresidentialPollingAverages"]["Historic"] combines all years except the current election cycle in a single Dataset.
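Because every evaluation downloads the data again, a local cache can save time. The following is a minimal sketch, not part of the resource function: the file name, the variable cacheFile and the helper cachedPollData are hypothetical, and it assumes the result can be round-tripped through a WXF file:

```wolfram
(* Hypothetical cache location; any writable path would do *)
cacheFile = FileNameJoin[{$UserDocumentsDirectory, "FiveThirtyEightPollingAverages.wxf"}];

(* Hypothetical helper: read the cached copy if present, otherwise download and save *)
cachedPollData[] := If[FileExistsQ[cacheFile],
  Import[cacheFile],
  With[{data = ResourceFunction["FiveThirtyEightPresidentialPollingAverages"][All]},
   (* only cache a successful download *)
   If[Head[data] === Dataset, Export[cacheFile, data]];
   data]]

polldata = cachedPollData[];
```

Deleting the cache file forces a fresh download on the next call.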

Examples

Basic Examples (3) 

Get polling data for the 1976 US presidential election from fivethirtyeight.com:

In[1]:=
ResourceFunction[
 "FiveThirtyEightPresidentialPollingAverages", ResourceSystemBase -> "https://www.wolframcloud.com/obj/resourcesystem/api/1.0"][1976]
Out[1]=

Retrieve polling data for the 2020 election cycle:

In[2]:=
ResourceFunction[
 "FiveThirtyEightPresidentialPollingAverages", ResourceSystemBase -> "https://www.wolframcloud.com/obj/resourcesystem/api/1.0"][2020]
Out[2]=

Get all polling data:

In[3]:=
polldata = ResourceFunction[
  "FiveThirtyEightPresidentialPollingAverages", ResourceSystemBase -> "https://www.wolframcloud.com/obj/resourcesystem/api/1.0"][All]
Out[3]=

See the included election dates:

In[4]:=
polldata[Counts, "ElectionDate"]
Out[4]=

See the number of statewide averages for each state or region:

In[5]:=
Normal@polldata[Counts, "State"]
Out[5]=

Scope (3) 

Use a DateObject to specify the date:

In[6]:=
ResourceFunction[
 "FiveThirtyEightPresidentialPollingAverages", ResourceSystemBase -> "https://www.wolframcloud.com/obj/resourcesystem/api/1.0"][
 DateObject[{1992}, "Year", "Gregorian", -5.`]]
Out[6]=

Get data for two elections:

In[7]:=
reagandata = ResourceFunction[
  "FiveThirtyEightPresidentialPollingAverages", ResourceSystemBase -> "https://www.wolframcloud.com/obj/resourcesystem/api/1.0"][{1980, 1984}]
Out[7]=
In[8]:=
reagandata[Counts, {"CandidateName", "Cycle"}]
Out[8]=

Get all polling data without the current election cycle using "Historic":

In[9]:=
ResourceFunction[
 "FiveThirtyEightPresidentialPollingAverages", ResourceSystemBase -> "https://www.wolframcloud.com/obj/resourcesystem/api/1.0"]["Historic"]
Out[9]=

Applications (3) 

Get current polling for the 2024 election and group all the polls by pollster and population, then split by polling date:

In[10]:=
current = ResourceFunction[
   "FiveThirtyEightPresidentialPollingAverages", ResourceSystemBase -> "https://www.wolframcloud.com/obj/resourcesystem/api/1.0"][2024];
bypollster = current[GroupBy[{#"Pollster", #"Population"} &]][All, GroupBy["EndDate"]];
bypollster[All, Keys]
Out[12]=

For each date, find a single value for each candidate. Taking the Min favors multi-candidate results over two-candidate results:

In[13]:=
bypollster = bypollster[All, All, GroupBy["CandidateName"], Min@*Lookup["Pct"]][
   All, Select[KeyExistsQ[Entity["Person", "JosephBiden::9g8qp"]]]];
First[bypollster]
Out[14]=
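
To see why Min picks the multi-candidate numbers, consider a toy example. The rows below are invented for illustration and are not taken from the dataset; they assume that a candidate's share is typically lower when more candidates are offered in the question:

```wolfram
(* Invented rows: the same poll asked as a two-way and as a three-way question *)
toy = {
   <|"CandidateName" -> "A", "Pct" -> 48|>, (* two-candidate question *)
   <|"CandidateName" -> "B", "Pct" -> 46|>,
   <|"CandidateName" -> "A", "Pct" -> 44|>, (* three-candidate question *)
   <|"CandidateName" -> "B", "Pct" -> 42|>,
   <|"CandidateName" -> "C", "Pct" -> 7|>};

(* Group the percentages by candidate and keep the smallest one *)
GroupBy[toy, Key["CandidateName"] -> Key["Pct"], Min]
(* <|"A" -> 44, "B" -> 42, "C" -> 7|> *)
```

Both A and B end up with their three-way values, so the difference between them is computed from the same question.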

Create time series from the data, eliminating pollsters without recent or many polls:

In[15]:=
rawseries = Transpose /@ Normal[bypollster[All, All, #[Entity["Person", "JosephBiden::9g8qp"] ] - #[
         Entity["Person", "DonaldTrump::6vv3q"]] &][
     All, {Keys[#], Values[#]} &]];
longseries = TimeSeries /@ Select[rawseries, Length[#] > 10 && #[[1, 1]] > (Today - Quantity[2, "Weeks"]) && #[[-1, 1]] < (Today - Quantity[6, "Months"]) &]
Out[16]=

Plot all the series:

In[17]:=
DateListPlot[longseries, PlotRange -> {{(Today - Quantity[6, "Months"]), Today}, All}]
Out[17]=

Take a simple average over the pollsters at regular intervals and plot the result:

In[18]:=
DateListPlot@
 Table[{date, Mean[(#[date]) & /@ longseries]}, {date, DateRange[(Today - Quantity[6, "Months"]), (Today - Quantity[2, "Weeks"]), Quantity[1, "Weeks"]]}]
Out[18]=
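
The averaging step relies on a TimeSeries being callable at dates between its samples, where it interpolates. A small sketch with two invented series (the names ts1 and ts2 and all values are made up for illustration):

```wolfram
(* Two invented pollster series, each with only two samples *)
ts1 = TimeSeries[{{DateObject[{2024, 1, 1}], 2.}, {DateObject[{2024, 1, 15}], 4.}}];
ts2 = TimeSeries[{{DateObject[{2024, 1, 1}], 0.}, {DateObject[{2024, 1, 15}], 6.}}];

(* Evaluate both series on a common weekly grid and average them *)
Table[{date, Mean[#[date] & /@ {ts1, ts2}]},
 {date, DateRange[DateObject[{2024, 1, 1}], DateObject[{2024, 1, 15}], Quantity[1, "Weeks"]]}]
```

The middle grid date falls between the samples of both series, so its value comes from interpolation rather than from an actual poll.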

Select data from just one election:

In[19]:=
oneelection = ResourceFunction[
  "FiveThirtyEightPresidentialPollingAverages", ResourceSystemBase -> "https://www.wolframcloud.com/obj/resourcesystem/api/1.0"][2008]
Out[19]=

See the candidates:

In[20]:=
oneelection[Counts, "CandidateName"]
Out[20]=

Get the last result for each state:

In[21]:=
bystate = oneelection[GroupBy["State"], GroupBy["CandidateName"], Last, "PctEstimate"]
Out[21]=

Calculate the estimated difference in support between the two candidates:

In[22]:=
diffbystate = bystate[All, #[Entity["Person", "BarackObama::7yj6w"]] - #[
     Entity["Person", "JohnMcCain::vw4dk"]] &]
Out[22]=
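
A possible follow-up is to pick out the closest races from such margins. The sketch below uses an invented association (toyMargins, with made-up keys and values) rather than the real result:

```wolfram
(* Invented margins in percentage points; positive means the first candidate leads *)
toyMargins = <|"A" -> 7.2, "B" -> -0.4, "C" -> 1.3, "D" -> -12.5|>;

(* The two entries with the smallest absolute margin *)
TakeSmallestBy[toyMargins, Abs, 2]
(* <|"B" -> -0.4, "C" -> 1.3|> *)
```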

Calculate timeline plots for each state:

In[23]:=
timelines = oneelection[GroupBy["State"], GroupBy["CandidateName"], All, {"ModelDate", "PctEstimate"}][All, DateListPlot, Values/*TimeSeries];
timelines[
 Entity["AdministrativeDivision", {"Missouri", "UnitedStates"}]]
Out[24]=

Create a map of the support difference that shows each state's timeline on hover:

In[25]:=
GeoRegionValuePlot[
 MapThread[
  Tooltip[#1[[1]], #2] -> #1[[2]] &, {Normal[
    Normal[KeyDrop[diffbystate, Entity["Country", "UnitedStates"]]]], Normal@Values[
     KeyDrop[timelines, Entity["Country", "UnitedStates"]]]}]]
Out[25]=

Get data for another election:

In[26]:=
oneelection = Select[ResourceFunction[
   "FiveThirtyEightPresidentialPollingAverages", ResourceSystemBase -> "https://www.wolframcloud.com/obj/resourcesystem/api/1.0"][], #Cycle === DateObject[{2016}] &]
Out[26]=

Create time series data for each state:

In[27]:=
timeseries = oneelection[GroupBy["State"], GroupBy["CandidateName"], All, {"ModelDate", "PctEstimate"}][All, All, Values/*TimeSeries];
RandomChoice[timeseries]
Out[28]=

Get the difference in support between the top two candidates over time for each state with over two hundred estimates:

In[29]:=
diffsbystate = timeseries[All, 1 ;; 2][All, Subtract @@ Values[#] &][
  Select[Length[#["Values"]] > 200 &]]
Out[29]=

Create a list of TimeSeries with samples over the smallest common range:

In[30]:=
Short[list = Normal@Values@
    diffsbystate[All, TimeSeriesResample[#, {diffsbystate[-1]["Dates"]}] &]]
Out[30]=

Calculate the correlations between states:

In[31]:=
corr = .5 + Outer[Correlation, list, list]/2;
Short[corr]
Out[32]=
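
The expression .5 + c/2 maps a correlation c from the interval -1 to 1 onto the interval 0 to 1, presumably so that the values can be passed directly to a color function. A short sketch with sample values (the list r is invented for illustration):

```wolfram
r = {-1, -0.5, 0, 0.5, 1};   (* sample correlation values *)
0.5 + r/2                    (* {0., 0.25, 0.5, 0.75, 1.} *)
Rescale[r, {-1, 1}]          (* numerically equal values via the built-in Rescale *)
```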

See the results:

In[33]:=
states = StringDelete[CommonName /@ Normal[Keys[diffsbystate]], ", United States"];
ResourceFunction["DatasetForm"][
 AssociationThread[
  states -> (AssociationThread[states -> #] & /@ corr)]]
Out[34]=

Visualize it:

In[35]:=
Grid[ MapIndexed[
  Item[Tooltip[Graphics[], states[[#2]]], Background -> ColorData["TemperatureMap"][#1]] &, corr, {2}], Spacings -> 0, ItemSize -> .5, Frame -> None]
Out[78]=

Possible Issues (3) 

The 2020 data includes numbers adjusted for convention bounces:

In[79]:=
bidentrump = ResourceFunction[
   "FiveThirtyEightPresidentialPollingAverages", ResourceSystemBase -> "https://www.wolframcloud.com/obj/resourcesystem/api/1.0"][2020];
bidentrump[Counts, "ConventionBounce"]
Out[80]=

The bounce-adjusted numbers differ considerably from the standard average:

In[81]:=
Normal@bidentrump[Select[#State === Entity["Country", "UnitedStates"] &]][GroupBy["CandidateName"], GroupBy["ConventionBounce"], All, {"ModelDate", "PctEstimate"}][
  All, DateListPlot, Values/*TimeSeries]
Out[81]=

Use only non-bounce-adjusted values instead:

In[82]:=
bidentrump[Select[#State === Entity["Country", "UnitedStates"] && ! #ConventionBounce &]][
  GroupBy["CandidateName"], All, {"ModelDate", "PctEstimate"}][DateListPlot, Values/*TimeSeries]
Out[82]=

Publisher

Bob

Version History

  • 3.1.1 – 17 April 2024
  • 3.1.0 – 10 April 2024
  • 3.0.0 – 09 October 2020
  • 2.0.0 – 16 September 2020
  • 1.0.0 – 07 July 2020

Related Resources

Author Notes

Version 3.1.1 Notes:

Fixed a date importing bug.

Version 3.1 Notes:

Preliminary, undocumented support for 2024. The data format will probably change once 538 releases a polling average.

Version 3 Notes:

- Added handling for "Candidate Bounce" text that FiveThirtyEight rashly put into the candidate name

Version 2 Notes:

- Added support for 2020 data

- Support limiting historic data by year

- The zero-argument form still works as before, but is deprecated and no longer documented. It is equivalent to "Historic"

TODO:

- Support MaxItems to limit the number of rows of data before interpretation to save time.
