Walkthrough — pandaSDMX 1.10.0 documentation (2024)

This page walks through an example pandaSDMX workflow, providing explanations of some SDMX concepts along the way. See also Resources, HOWTOs for miscellaneous tasks, and follow links to the Glossary where some terms are explained.

SDMX workflow
Choose and connect to an SDMX web service
- Configure the HTTP connection
- Cache HTTP responses and parsed objects
- Using custom sessions
Obtain and explore metadata
- Get information about the source’s data flows
- Convert metadata to pandas.Series
- Extract the metadata related to a data flow
- Internationalisation
- Understanding constraints
Select and query data from a dataflow
- Choose a data format
- Construct a selection key for a query
- Query data
Convert data to pandas
- Select columns using the model API
- Convert dimensions to pandas.DatetimeIndex or PeriodIndex
- Data types
Work with files
Handle errors

SDMX workflow

Working with statistical data often includes some or all of the following steps. pandaSDMX builds on SDMX features to make the steps straightforward:

Choose a data provider.
pandaSDMX provides a built-in list of Data sources.
Investigate what data is available.
Using pandaSDMX, download the catalogue of data flows available from the data provider and select a data flow for further inspection.
Understand what form the data comes in.
Download structure and other metadata on the selected data flow and the data it contains, including the data structure definition, concepts, codelists and content constraints.
Decide what data points you need from the dataflow.
Analyze the structural metadata, by directly inspecting objects or converting them to pandas types.
Download the actual data.
Using pandaSDMX, specify the needed portions of the data from the data flow by constructing a selection (‘key’) of series and a period/time range. Then, retrieve the data using Request.get().
Analyze or manipulate the data.
Convert to pandas types using to_pandas() (or, equivalently, the top level function to_pandas()) and use the result in further Python code.

Choose and connect to an SDMX web service

First, we instantiate a pandasdmx.Request object, using the string ID of a data source supported by pandaSDMX:

In [1]: import pandasdmx as sdmx

In [2]: ecb = sdmx.Request('ECB')

The object ecb is now ready to make multiple data and metadata queries to the European Central Bank’s web service. To send requests to multiple web services, we could instantiate multiple Requests.

pandaSDMX knows the URLs to the online documentation pages of each data source. The convenience method pandasdmx.api.Request.view_doc() opens it in the standard browser.

New in version 1.3.0.

Configure the HTTP connection

pandaSDMX builds on the widely-used requests Python HTTP library. To pre-configure all queries made by a Request, we can pass any of the keyword arguments recognized by requests.request(). For example, a proxy server can be specified:

In [3]: ecb_via_proxy = sdmx.Request(
   ...:     'ECB',
   ...:     proxies={'http': 'http://1.2.3.4:5678'}
   ...: )

The session attribute is a familiar requests.Session object that can be used to inspect and modify configuration between queries:

In [4]: ecb_via_proxy.session.proxies
Out[4]: {'http': 'http://1.2.3.4:5678'}

For convenience, timeout stores the timeout in seconds for HTTP requests, and is passed automatically for all queries.

Cache HTTP responses and parsed objects

New in version 0.3.0.

Using custom sessions

New in version 1.0.0.

The Request constructor takes an optional keyword argument session. For instance, a requests.Session with pre-mounted adapters, or one patched by an alternative caching library such as CacheControl, can be passed:

>>> from pandasdmx import Request
>>> # my_awesome_session is a requests.Session configured beforehand
>>> awesome_ecb_req = Request('ECB', session=my_awesome_session)

Obtain and explore metadata

This section illustrates how to download and explore metadata. Suppose we are looking for time-series on exchange rates, and we know that the European Central Bank provides a relevant data flow.

We could search the Internet for the dataflow ID or browse the ECB’s website. However, we can also use pandaSDMX to retrieve metadata and get a complete overview of the dataflows the ECB provides.

Get information about the source’s data flows

We use pandaSDMX to download the definitions for all data flows available from our chosen source. We could call Request.get() with [resource_type=]'dataflow' as the first argument, but can also use a shorter alias:

In [6]: flow_msg = ecb.dataflow()

The query returns a Message instance. We can also see the URL that was queried and the response headers by accessing the Message.response attribute:

In [7]: flow_msg.response.url
Out[7]: 'https://sdw-wsrest.ecb.europa.eu/service/dataflow/ECB/latest'

In [8]: flow_msg.response.headers
Out[8]: {'Server': 'myracloud', 'Date': 'Sat, 25 Feb 2023 19:58:27 GMT', 'Content-Type': 'application/vnd.sdmx.structure+xml;version=2.1', 'Transfer-Encoding': 'chunked', 'Connection': 'keep-alive', 'vary': 'accept-encoding, accept, origin', 'Content-Encoding': 'gzip', 'Cache-Control': 'no-cache, no-store, max-age=0', 'ETag': '"myra-4df78e45"'}

All the content of the response (SDMX data and metadata objects) has been parsed and is accessible from flow_msg. Let’s find out what we have received:

In [9]: flow_msg
Out[9]:
<pandasdmx.StructureMessage>
  <Header>
    id: 'IREF551058'
    prepared: '2023-02-25T19:58:27+00:00'
    receiver: <Agency not_supplied>
    sender: <Agency Unknown>
    source:
    test: False
  response: <Response [200]>
  DataflowDefinition (75): AME BKN BLS BNT BOP BSI BSP CBD CBD2 CCP CIS...
  DataStructureDefinition (57): ECB_AME1 ECB_BKN1 ECB_BLS1 ECB_BOP_BNT ...

The string representation of the Message shows us a few things:

This is a Structure-, rather than DataMessage.
It contains 75 DataflowDefinition objects. Because we didn’t specify an ID of a particular data flow, we received the definitions for all data flows available from the ECB web service.
The first of these have ID attributes like ‘AME’, ‘BKN’, …

We could inspect each of these individually using the StructureMessage.dataflow attribute, a DictLike object that allows attribute- and index-style access:

In [10]: flow_msg.dataflow.BOP
Out[10]: <DataflowDefinition ECB:BOP(1.0): Euro Area Balance of Payments and International Investment Position Statistics>

Convert metadata to `pandas.Series`

However, an easier way is to use pandasdmx to convert some of the information to a pandas.Series:

In [11]: dataflows = sdmx.to_pandas(flow_msg.dataflow)

In [12]: dataflows.head()
Out[12]:
AME    AMECO
BKN    Banknotes statistics
BLS    Bank Lending Survey Statistics
BNT    Shipments of Euro Banknotes Statistics (ESCB)
BOP    Euro Area Balance of Payments and Internationa...
dtype: object

In [13]: len(dataflows)
Out[13]: 75

to_pandas() accepts most instances and Python collections of pandasdmx.model objects, and we can use keyword arguments to control how each of these is handled.Under the hood, it calls pandasdmx.writer.write(). See the function documentation for details.

If we want to export the entire message content to pandas rather than selecting some resource such as dataflows as in the above example, the pandasdmx.message.Message.to_pandas() method comes in handy.

As we are interested in exchange rate data, let’s use built-in Pandas methods to find an appropriate data flow:

In [14]: dataflows[dataflows.str.contains('exchange', case=False)]
Out[14]:
EXR    Exchange Rates
FXI    Foreign Exchange Statistics
SEE    Securities exchange - Trading Statistics
dtype: object

We decide to look at ‘EXR’.

Some agencies, including ECB and INSEE, offer categorizations of data flows to help with this step. See this HOWTO entry.

Extract the metadata related to a data flow

We will download the data flow definition with the ID ‘EXR’ from the European Central Bank. This data flow definition is already contained in the flow_msg we retrieved with the last query, but without the data structure or any related metadata. Now we will pass the data flow ID ‘EXR’, which prompts pandaSDMX to set the references query parameter to ‘all’. The ECB SDMX service responds by returning all metadata related to the dataflow:

# Here we could also use the object we have in hand:
# exr_msg = ecb.dataflow(resource=flow_msg.dataflow.EXR)
In [15]: exr_msg = ecb.dataflow('EXR')

In [16]: exr_msg.response.url
Out[16]: 'https://sdw-wsrest.ecb.europa.eu/service/dataflow/ECB/EXR/latest?references=all'

# The response includes several classes of SDMX objects
In [17]: exr_msg
Out[17]:
<pandasdmx.StructureMessage>
  <Header>
    id: 'IREF551059'
    prepared: '2023-02-25T19:58:27+00:00'
    receiver: <Agency not_supplied>
    sender: <Agency Unknown>
    source:
    test: False
  response: <Response [200]>
  Categorisation (2): 21e97b57-5950-eaab-eead-1534306b28af 53A341E8-D48...
  CategoryScheme (2): WDC_NAVI WDC_NAVI_OLD
  Codelist (11): CL_COLLECTION CL_CURRENCY CL_DECIMALS CL_EXR_SUFFIX CL...
  ConceptScheme (1): ECB_CONCEPTS
  ContentConstraint (1): EXR_CONSTRAINTS
  DataflowDefinition (1): EXR
  DataStructureDefinition (1): ECB_EXR1
  AgencyScheme (1): AGENCIES

In [18]: exr_flow = exr_msg.dataflow.EXR

The DataflowDefinition.structure attribute refers to the data structure definition (DSD, an instance of DataStructureDefinition). As the name implies, this object contains metadata that describes the structure of data in the ‘EXR’ flow:

# Show the data structure definition referred to by the data flow
In [19]: dsd = exr_flow.structure

In [20]: dsd
Out[20]: <DataStructureDefinition ECB:ECB_EXR1(1.0): Exchange Rates>

# The same object instance is accessible from the StructureMessage
In [21]: dsd is exr_msg.structure.ECB_EXR1
Out[21]: True

Among other things, the DSD defines:

the order and names of the Dimensions, and the allowed values, data type or codes for each dimension,
the names, allowed values, and valid points of attachment for DataAttributes, and
the PrimaryMeasure, i.e. a description of the thing being measured by the observation values.

# Explore the DSD
In [22]: dsd.dimensions.components
Out[22]:
[<Dimension FREQ>,
 <Dimension CURRENCY>,
 <Dimension CURRENCY_DENOM>,
 <Dimension EXR_TYPE>,
 <Dimension EXR_SUFFIX>,
 <TimeDimension TIME_PERIOD>]

In [23]: dsd.attributes.components
Out[23]:
[<DataAttribute TIME_FORMAT>,
 <DataAttribute OBS_STATUS>,
 <DataAttribute OBS_CONF>,
 <DataAttribute OBS_PRE_BREAK>,
 <DataAttribute OBS_COM>,
 <DataAttribute BREAKS>,
 <DataAttribute COLLECTION>,
 <DataAttribute COMPILING_ORG>,
 <DataAttribute DISS_ORG>,
 <DataAttribute DOM_SER_IDS>,
 <DataAttribute PUBL_ECB>,
 <DataAttribute PUBL_MU>,
 <DataAttribute PUBL_PUBLIC>,
 <DataAttribute UNIT_INDEX_BASE>,
 <DataAttribute COMPILATION>,
 <DataAttribute COVERAGE>,
 <DataAttribute DECIMALS>,
 <DataAttribute NAT_TITLE>,
 <DataAttribute SOURCE_AGENCY>,
 <DataAttribute SOURCE_PUB>,
 <DataAttribute TITLE>,
 <DataAttribute TITLE_COMPL>,
 <DataAttribute UNIT>,
 <DataAttribute UNIT_MULT>]

In [24]: dsd.measures.components
Out[24]: [<PrimaryMeasure OBS_VALUE>]

Choosing just the FREQ dimension, we can explore the Codelist that contains valid values for this dimension in the data flow:

# Show a codelist referenced by a dimension, containing a superset
# of existing values
In [25]: cl = dsd.dimensions.get('FREQ').local_representation.enumerated

In [26]: cl
Out[26]: <Codelist ECB:CL_FREQ(1.0) (10 items): Frequency code list>

# Again, the same object can be accessed directly
In [27]: cl is exr_msg.codelist.CL_FREQ
Out[27]: True

# Convert to a pandas.Series to see more information
In [28]: sdmx.to_pandas(cl)
Out[28]:
         name                                                parent
CL_FREQ
A        Annual                                              CL_FREQ
B        Daily - businessweek                                CL_FREQ
D        Daily                                               CL_FREQ
E        Event (not supported)                               CL_FREQ
H        Half-yearly                                         CL_FREQ
M        Monthly                                             CL_FREQ
N        Minutely                                            CL_FREQ
Q        Quarterly                                           CL_FREQ
S        Half-yearly, semester (value introduced in 200...   CL_FREQ
W        Weekly                                              CL_FREQ

Internationalisation

Data providers may include names and descriptions of dataflows, dimensions, codes etc. in multiple languages. In the information model, this is reflected in the model.InternationalString. When exporting such metadata to a pandas object, the language is selected in two stages. First, a global default locale setting is used. When importing pandaSDMX, this default locale is always set to "en" as most data providers commonly include English strings. Second, if there is no English version of a given attribute, the fallback is to take the first language found in the InternationalString.

You can change the default locale prior to exporting metadata to pandas through a convenient property on the Request as follows:

In [29]: ecb.default_locale
Out[29]: 'en'

In [30]: ecb.default_locale = "fr"

# Note that this setting is global, not per Request instance.
In [31]: insee_flows = sdmx.Request('insee').dataflow()

In [32]: sdmx.to_pandas(insee_flows.dataflow).head()
Out[32]:
BALANCE-PAIEMENTS        Balance des paiements
CHOMAGE-TRIM-NATIONAL    Chômage, taux de chômage par sexe et âge (sens...
CLIMAT-AFFAIRES          Indicateurs synthétiques du climat des affaires
CNA-2010-CONSO-MEN       Consommation des ménages - Résultats par produ...
CNA-2010-CONSO-SI        Dépenses de consommation finale par secteur in...
dtype: object

In [33]: ecb.default_locale = "en"

In [34]: sdmx.to_pandas(insee_flows.dataflow).head()
Out[34]:
BALANCE-PAIEMENTS        Balance of payments
CHOMAGE-TRIM-NATIONAL    Unemployment, unemployment rate and halo by se...
CLIMAT-AFFAIRES          Business climate composite indicators
CNA-2010-CONSO-MEN       Households' consumption - Results by product, ...
CNA-2010-CONSO-SI        Final consumption expenditure by institutional...
dtype: object

New in version 1.8.1.
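
The two-stage language selection described in this section can be sketched in plain Python. This only illustrates the fallback rule and is not pandaSDMX's implementation: the helper name localize and the plain dictionary standing in for an InternationalString are assumptions made for the example.

```python
def localize(localizations, default_locale="en"):
    """Pick one string from a {language: text} mapping (hypothetical helper).

    Stage 1: use the default locale if a string exists for it.
    Stage 2: otherwise fall back to the first language found.
    """
    if not localizations:
        return None
    if default_locale in localizations:
        return localizations[default_locale]
    # dicts preserve insertion order, so this is the first language stored
    return next(iter(localizations.values()))


# Names taken from the INSEE output shown in this section
name = {"fr": "Balance des paiements", "en": "Balance of payments"}

print(localize(name))                             # default locale "en"
print(localize(name, default_locale="fr"))        # explicit French
print(localize({"fr": "Balance des paiements"}))  # no English: first language found
```

Because the real setting is global, changing it affects every later conversion, which is why the session above switches it back to "en" afterwards.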

Understanding constraints

The CURRENCY and CURRENCY_DENOM dimensions of this DSD are both represented using the same CL_CURRENCY code list. In order to be reusable for as many data sets as possible, this code list is extensive and complete:

In [35]: len(exr_msg.codelist.CL_CURRENCY)
Out[35]: 367

However, the European Central Bank does not, in its ‘EXR’ data flow, commit to providing exchange rates between, for instance, the Congolese franc (‘CDF’) and Peruvian sol (‘PEN’). In other words, the set of (CURRENCY, CURRENCY_DENOM) values that we can expect to find in ‘EXR’ is much smaller than the 367 × 367 possible combinations of two values from CL_CURRENCY.

How much smaller? Let’s return to explore the ContentConstraint that came with our metadata query:

In [36]: exr_msg.constraint.EXR_CONSTRAINTS
Out[36]: <ContentConstraint EXR_CONSTRAINTS: Constraints for the EXR dataflow.>

# Get the content 'region' included in the constraint
In [37]: cr = exr_msg.constraint.EXR_CONSTRAINTS.data_content_region[0]

# Get the valid members for two dimensions
In [38]: c1 = sdmx.to_pandas(cr.member['CURRENCY'].values)

In [39]: len(c1)
Out[39]: 61

In [40]: c2 = sdmx.to_pandas(cr.member['CURRENCY_DENOM'].values)

In [41]: len(c2)
Out[41]: 62

# Explore the contents
# Currencies that are valid for CURRENCY_DENOM, but not CURRENCY
In [42]: c1, c2 = set(c1), set(c2)

In [43]: c2 - c1
Out[43]: {'AED', 'ATS', 'BEF', 'CLP', 'COP', 'DEM', 'ESP', 'EUR', 'FIM', 'FRF', 'IEP', 'ITL', 'LUF', 'NLG', 'PEN', 'PTE', 'SAR', 'UAH'}

# The opposite:
In [44]: c1 - c2
Out[44]: {'E01', 'E02', 'E03', 'E5', 'E7', 'E8', 'EGP', 'H00', 'H01', 'H02', 'H03', 'H10', 'H11', 'H37', 'H42', 'H7', 'H8'}

# Check certain contents
In [45]: {'CDF', 'PEN'} < c1 | c2
Out[45]: False

In [46]: {'USD', 'JPY'} < c1 & c2
Out[46]: True

We also see that ‘USD’ and ‘JPY’ are valid values along both dimensions.

Attribute names and allowed values can be obtained in a similar fashion.
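
To put a number on "how much smaller", the sizes reported in the session above can be combined in a few lines of arithmetic. The result is an upper bound only: the constraint lists valid members per dimension, and this sketch assumes nothing about which pairs of those members actually occur in ‘EXR’.

```python
# Sizes reported in the session above
n_codes = 367     # len(exr_msg.codelist.CL_CURRENCY)
n_currency = 61   # valid CURRENCY members in the constraint
n_denom = 62      # valid CURRENCY_DENOM members in the constraint

# Every pairing of two codes from the full code list
unconstrained = n_codes * n_codes

# At most this many pairings remain once the constraint is applied
constrained_upper_bound = n_currency * n_denom

share = constrained_upper_bound / unconstrained
print(unconstrained, constrained_upper_bound, f"{share:.1%}")
```

So the constraint leaves at most 3,782 of the 134,689 conceivable pairs, which is under 3 percent.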

Select and query data from a dataflow

Next, we will query some data. The step is simple: call Request.get() with resource_type=’data’ as the first argument, or the alias Request.data().

First, however, we describe some of the many options offered by SDMX and pandaSDMX for data queries.

Choose a data format

Web services offering SDMX-ML–formatted DataMessages can return them in one of two formats:

Generic data

use XML elements that explicitly identify whether values associated with an Observation are dimensions, or attributes.

For example, in the ‘EXR’ data flow, the XML content for the CURRENCY_DENOM dimension and for the OBS_STATUS attribute are stored differently:

<generic:Obs>
  <generic:ObsKey>
    <!-- NB. Other dimensions omitted. -->
    <generic:Value value="EUR" id="CURRENCY_DENOM"/>
    <!-- … -->
  </generic:ObsKey>
  <generic:ObsValue value="0.82363"/>
  <generic:Attributes>
    <!-- NB. Other attributes omitted. -->
    <generic:Value value="A" id="OBS_STATUS"/>
    <!-- … -->
  </generic:Attributes>
</generic:Obs>

Structure-specific data

use a more concise format:

<!-- NB. Other dimensions and attributes omitted: -->
<Obs CURRENCY_DENOM="EUR" OBS_VALUE="0.82363" OBS_STATUS="A" />

This can result in much smaller messages. However, because this format does not distinguish dimensions and attributes, it cannot be properly parsed by pandaSDMX without separately obtaining the data structure definition.
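
The practical difference between the two formats shows up as soon as one tries to parse them. The sketch below uses only the standard library and the two observations quoted above; it is not how pandaSDMX parses messages. The namespace URI, the helper names, and the plain set of dimension IDs standing in for the DSD are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed namespace for SDMX-ML 2.1 generic data; declared here so the
# fragment parses on its own.
NS = {"generic": "http://www.sdmx.org/resources/sdmxml/schemas/v2_1/data/generic"}

GENERIC = """
<generic:Obs xmlns:generic="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/data/generic">
  <generic:ObsKey>
    <generic:Value value="EUR" id="CURRENCY_DENOM"/>
  </generic:ObsKey>
  <generic:ObsValue value="0.82363"/>
  <generic:Attributes>
    <generic:Value value="A" id="OBS_STATUS"/>
  </generic:Attributes>
</generic:Obs>
""".strip()

STRUCTURE_SPECIFIC = '<Obs CURRENCY_DENOM="EUR" OBS_VALUE="0.82363" OBS_STATUS="A" />'


def parse_generic(xml_text):
    """The element names alone say what is a dimension and what is an attribute."""
    obs = ET.fromstring(xml_text)
    dims = {v.get("id"): v.get("value")
            for v in obs.findall("generic:ObsKey/generic:Value", NS)}
    attrs = {v.get("id"): v.get("value")
             for v in obs.findall("generic:Attributes/generic:Value", NS)}
    value = float(obs.find("generic:ObsValue", NS).get("value"))
    return dims, attrs, value


def parse_structure_specific(xml_text, dimension_ids, measure_id="OBS_VALUE"):
    """Everything is an XML attribute; dimension_ids (from the DSD) sorts them out."""
    obs = ET.fromstring(xml_text)
    dims = {k: v for k, v in obs.attrib.items() if k in dimension_ids}
    attrs = {k: v for k, v in obs.attrib.items()
             if k not in dimension_ids and k != measure_id}
    return dims, attrs, float(obs.attrib[measure_id])


print(parse_generic(GENERIC))
print(parse_structure_specific(STRUCTURE_SPECIFIC, {"CURRENCY_DENOM"}))
```

With the dimension IDs supplied, both calls yield the same result. Passing an empty set instead misfiles CURRENCY_DENOM as an attribute, which is the ambiguity the DSD resolves.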

pandaSDMX adds appropriate HTTP headers for retrieving structure-specific data (see implementation notes). In general, to minimize queries and message size:

First query for the DSD associated with a data flow.
When requesting data, pass the obtained object as the dsd= argument to Request.get() or Request.data().

This allows pandaSDMX to retrieve structure-specific data whenever possible. It can also avoid an additional request when validating data query keys (below).

Construct a selection key for a query

SDMX web services can offer access to very large data flows. Queries for all the data in a data flow are not usually necessary, and in some cases servers will refuse to respond. Selecting a subset of the data keeps responses smaller and queries faster.

The SDMX REST API offers two ways to narrow a data request:

specify a key, i.e. values for 1 or more dimensions to be matched by returned Observations and SeriesKeys. The key is included as part of the URL constructed for the query. Using pandaSDMX, a key is specified by the key= argument to Request.get.
limit the time period, using the HTTP parameters ‘startPeriod’ and ‘endPeriod’. Using pandaSDMX, these are specified using the params= argument to Request.get.

From the ECB’s dataflow on exchange rates, we specify the CURRENCY dimension to contain either of the codes ‘USD’ or ‘JPY’. The documentation for Request.get() describes the multiple forms of the key argument and the validation applied. The following are all equivalent:

In [47]: key = dict(CURRENCY=['USD', 'JPY'])

In [48]: key = '.USD+JPY...'

We also set a start period to exclude older data:

In [49]: params = dict(startPeriod='2016')
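
To see why the dictionary and string forms of the key are equivalent, and where the key and the period parameters end up in the request, the following sketch builds the key string and a query URL by hand. It is an illustration, not pandaSDMX's code: the helper names are hypothetical, the dimension order is copied from the DSD shown earlier (without the time dimension), and the /data/{flow}/{key} path layout is assumed from the SDMX REST conventions.

```python
from urllib.parse import urlencode

# Dimension order from dsd.dimensions.components, without TIME_PERIOD
DIMENSIONS = ["FREQ", "CURRENCY", "CURRENCY_DENOM", "EXR_TYPE", "EXR_SUFFIX"]


def key_to_string(key, dimensions=DIMENSIONS):
    """Join codes with '+' within a dimension and '.' between dimensions."""
    unknown = set(key) - set(dimensions)
    if unknown:
        raise ValueError(f"unknown dimension(s): {sorted(unknown)}")
    parts = []
    for dim in dimensions:
        codes = key.get(dim, [])  # an empty position means "any value"
        if isinstance(codes, str):
            codes = [codes]
        parts.append("+".join(codes))
    return ".".join(parts)


def data_url(base, flow, key, params=None):
    """Assemble a data query URL; the period limits go in the query string."""
    url = f"{base}/data/{flow}/{key_to_string(key)}"
    if params:
        url += "?" + urlencode(params)
    return url


key = dict(CURRENCY=["USD", "JPY"])
params = dict(startPeriod="2016")

print(key_to_string(key))
print(data_url("https://sdw-wsrest.ecb.europa.eu/service", "EXR", key, params))
```

The first line printed is the string form given above, '.USD+JPY...': CURRENCY is the second of five positions, and the other four are left empty.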

Another way to validate a key is to check it against a series-keys-only dataset, i.e. a dataset with all possible series keys where no series contains any observation. pandaSDMX supports this validation method as well. However, it is disabled by default. Pass series_keys=True to the Request method to validate a given key against a series-keys-only dataset rather than the DSD.
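
The idea behind series-key validation can be sketched without any SDMX machinery: a key is worth sending only if at least one existing series matches it on every dimension it names. The helper below is hypothetical and the list holds just two series keys, both of which appear in the query results later on this page; a real series-keys-only dataset would contain every series in the data flow.

```python
# Two series keys that appear in the 'EXR' results on this page
SERIES_KEYS = [
    dict(FREQ="D", CURRENCY="USD", CURRENCY_DENOM="EUR", EXR_TYPE="SP00", EXR_SUFFIX="A"),
    dict(FREQ="D", CURRENCY="JPY", CURRENCY_DENOM="EUR", EXR_TYPE="SP00", EXR_SUFFIX="A"),
]


def _as_set(codes):
    """Accept either a single code or an iterable of codes."""
    return {codes} if isinstance(codes, str) else set(codes)


def key_matches_any_series(key, series_keys):
    """True if some series key agrees with `key` on every dimension it names."""
    wanted = {dim: _as_set(codes) for dim, codes in key.items()}
    return any(
        all(sk.get(dim) in codes for dim, codes in wanted.items())
        for sk in series_keys
    )


print(key_matches_any_series(dict(CURRENCY=["USD", "JPY"]), SERIES_KEYS))
print(key_matches_any_series(dict(CURRENCY="CDF"), SERIES_KEYS))
```

Validating against the DSD would accept ‘CDF’, since it is a member of CL_CURRENCY; validating against actual series keys rejects it because no such series exists.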

Query data

Finally, we request the data in generic format:

In [50]: import sys

In [51]: ecb = sdmx.Request('ECB', backend='memory')

In [52]: data_msg = ecb.data('EXR', key=key, params=params)

# Generic data was returned
In [53]: data_msg.response.headers['content-type']
Out[53]: 'application/vnd.sdmx.genericdata+xml;version=2.1'

# Number of bytes in the cached response
In [54]: bytes1 = sys.getsizeof(ecb.session.cache.responses.popitem()[1].content)

In [55]: bytes1
Out[55]: 952591

To demonstrate a query for a structure-specific data set, we pass the DSD obtained in the previous section:

In [56]: ss_msg = ecb.data('EXR', key=key, params=params, dsd=dsd)

# Structure-specific data was requested and returned
In [57]: ss_msg.response.request.headers['accept']
Out[57]: 'application/vnd.sdmx.structurespecificdata+xml;version=2.1'

In [58]: ss_msg.response.headers['content-type']
Out[58]: 'application/vnd.sdmx.structurespecificdata+xml;version=2.1'

# Number of bytes in the cached response
In [59]: bytes2 = sys.getsizeof(ecb.session.cache.responses.popitem()[1].content)

In [60]: bytes2 / bytes1
Out[60]: 0.34049660347410377

The structure-specific message is about one third the size of the generic message.

In [61]: data = data_msg.data[0]

In [62]: type(data)
Out[62]: pandasdmx.model.GenericDataSet

In [63]: len(data.series)
Out[63]: 16

In [64]: list(data.series.keys())[5]
Out[64]: <SeriesKey: FREQ=D, CURRENCY=USD, CURRENCY_DENOM=EUR, EXR_TYPE=SP00, EXR_SUFFIX=A>

In [65]: set(series_key.FREQ for series_key in data.series.keys())
Out[65]:
{<KeyValue: FREQ=A>,
 <KeyValue: FREQ=D>,
 <KeyValue: FREQ=H>,
 <KeyValue: FREQ=M>,
 <KeyValue: FREQ=Q>}

This dataset thus comprises 16 time series of several different period lengths. We could have chosen to request only daily data in the first place by providing the value ‘D’ for the FREQ dimension. In the next section we will show how columns from a dataset can be selected through the information model when writing to a pandas object.

Convert data to pandas

Select columns using the model API

Because we want to write the data to a pandas DataFrame rather than an iterator of pandas Series, we must avoid mixing different frequencies: pandas may raise an error when passed data with incompatible frequencies. Therefore, we single out the series with daily data. The to_pandas() method accepts an optional iterable to select a subset of the series contained in the dataset. Thus we can now generate our pandas DataFrame from daily exchange rate data only:

In [66]: import pandas as pd

In [67]: daily = [s for sk, s in data.series.items() if sk.FREQ == 'D']

In [68]: cur_df = pd.concat(sdmx.to_pandas(daily)).unstack()

In [69]: cur_df.shape
Out[69]: (2, 1834)

In [70]: cur_df.tail()
Out[70]:
TIME_PERIOD                                       2016-01-04  ...  2023-02-24
FREQ CURRENCY CURRENCY_DENOM EXR_TYPE EXR_SUFFIX              ...
D    JPY      EUR            SP00     A             129.7800  ...     143.550
     USD      EUR            SP00     A               1.0898  ...       1.057

[2 rows x 1834 columns]

Convert dimensions to `pandas.DatetimeIndex` or `PeriodIndex`

SDMX datasets often have a Dimension with a name like TIME_PERIOD. To ease further processing of time-series data read from pandasdmx messages, write_dataset() provides a datetime argument to convert these into pandas.DatetimeIndex and PeriodIndex classes.

For multi-dimensional datasets, write_dataset() usually returns a pandas.Series with a MultiIndex that has one level for each dimension. However, MultiIndex and DatetimeIndex/PeriodIndex are incompatible; it is not possible to use pandas’ date/time features for just one level of a MultiIndex (e.g. TIME_PERIOD) while using other types for the other levels/dimensions (e.g. strings for CURRENCY).

For this reason, when the datetime argument is used, write_dataset() returns a DataFrame: the DatetimeIndex/PeriodIndex is used along axis 0, and all other dimensions are collected in a MultiIndex on axis 1.

An example, using the same data flow as above:

In [71]: key = dict(CURRENCY_DENOM='EUR', FREQ='M', EXR_SUFFIX='A')

In [72]: params = dict(startPeriod='2019-01', endPeriod='2019-06')

In [73]: data = ecb.data('EXR', key=key, params=params).data[0]

Without date-time conversion, to_pandas() produces a MultiIndex:

In [74]: sdmx.to_pandas(data)
Out[74]:
FREQ  CURRENCY  CURRENCY_DENOM  EXR_TYPE  EXR_SUFFIX  TIME_PERIOD
M     ARS       EUR             SP00      A           2019-01        42.736877
                                                      2019-02        43.523655
                                                      2019-03        46.479714
                                                      2019-04        48.520795
                                                      2019-05        50.155077
                                                                       ...
      ZAR       EUR             SP00      A           2019-02        15.687945
                                                      2019-03        16.250743
                                                      2019-04        15.895875
                                                      2019-05        16.137123
                                                      2019-06        16.474880
Name: value, Length: 336, dtype: float64

With date-time conversion, it produces a DatetimeIndex:

In [75]: df1 = sdmx.to_pandas(data, datetime='TIME_PERIOD')

In [76]: df1.index
Out[76]: DatetimeIndex(['2019-01-01', '2019-02-01', '2019-03-01', '2019-04-01', '2019-05-01', '2019-06-01'], dtype='datetime64[ns]', name='TIME_PERIOD', freq=None)

In [77]: df1
Out[77]:
FREQ                    M                    ...
CURRENCY              ARS       AUD     BGN  ...        TWD       USD        ZAR
CURRENCY_DENOM        EUR       EUR     EUR  ...        EUR       EUR        EUR
EXR_TYPE             SP00      SP00    SP00  ...       SP00      SP00       SP00
EXR_SUFFIX              A         A       A  ...          A         A          A
TIME_PERIOD                                  ...
2019-01-01      42.736877  1.597514  1.9558  ...  35.201205  1.141641  15.816950
2019-02-01      43.523655  1.589500  1.9558  ...  34.963125  1.135115  15.687945
2019-03-01      46.479714  1.595890  1.9558  ...  34.877605  1.130248  16.250743
2019-04-01      48.520795  1.580175  1.9558  ...  34.676925  1.123825  15.895875
2019-05-01      50.155077  1.611641  1.9558  ...  34.967468  1.118459  16.137123
2019-06-01      49.506670  1.626430  1.9558  ...  35.332025  1.129340  16.474880

[6 rows x 56 columns]

Use the advanced functionality to specify a dimension for the frequency of a PeriodIndex, and change the orientation so that the PeriodIndex is on the columns:

In [78]: df2 = sdmx.to_pandas(
   ....:     data,
   ....:     datetime=dict(dim='TIME_PERIOD', freq='FREQ', axis=1))

In [79]: df2.columns
Out[79]: PeriodIndex(['2019-01', '2019-02', '2019-03', '2019-04', '2019-05', '2019-06'], dtype='period[M]', name='TIME_PERIOD')

In [80]: df2
Out[80]:
TIME_PERIOD                                       2019-01  ...       2019-06
CURRENCY CURRENCY_DENOM EXR_TYPE EXR_SUFFIX                ...
ARS      EUR            SP00     A              42.736877  ...     49.506670
AUD      EUR            SP00     A               1.597514  ...      1.626430
BGN      EUR            SP00     A               1.955800  ...      1.955800
BRL      EUR            SP00     A               4.269982  ...      4.360035
CAD      EUR            SP00     A               1.519614  ...      1.501140
CHF      EUR            SP00     A               1.129700  ...      1.116705
CNY      EUR            SP00     A               7.750350  ...      7.793685
CZK      EUR            SP00     A              25.650091  ...     25.604800
DKK      EUR            SP00     A               7.465736  ...      7.466945
DZD      EUR            SP00     A             135.144305  ...    134.583205
E01      EUR            EN00     A             101.498500  ...    101.315800
                        ERC0     A             102.930800  ...    102.664100
                        ERP0     A              99.767800  ...     98.929400
E02      EUR            EN00     A              98.725100  ...     98.689200
                        ERC0     A              94.219500  ...     93.825200
                        ERP0     A              93.095500  ...     92.611000
E03      EUR            EN00     A             116.457900  ...    116.342900
                        ERC0     A              93.673100  ...     93.099100
E5       EUR            EN00     A              98.690400  ...     98.652700
                        ERC0     A              94.156000  ...     93.766300
                        ERP0     A              93.074700  ...     92.585100
E7       EUR            EN00     A             116.304700  ...    116.189100
                        ERC0     A              93.628800  ...     93.061700
E8       EUR            EN00     A             101.499000  ...    101.315800
                        ERC0     A             102.935900  ...    102.673800
                        ERP0     A              99.764800  ...     98.926500
GBP      EUR            SP00     A               0.886030  ...      0.891073
H37      EUR            NN00     A             110.775895  ...    112.853400
                        NRC0     A             105.645943  ...    108.202159
H42      EUR            NN00     A             111.613486  ...    112.945574
                        NRC0     A             105.737631  ...    107.978611
HKD      EUR            SP00     A               8.952745  ...      8.838280
HRK      EUR            SP00     A               7.428550  ...      7.407885
HUF      EUR            SP00     A             319.800455  ...    322.558500
IDR      EUR            SP00     A           16164.770000  ...  16060.272000
ILS      EUR            SP00     A               4.207545  ...      4.062430
INR      EUR            SP00     A              80.798273  ...     78.407800
ISK      EUR            SP00     A             136.659091  ...    140.820000
JPY      EUR            SP00     A             124.341364  ...    122.080500
KRW      EUR            SP00     A            1281.459091  ...   1325.280500
MAD      EUR            SP00     A              10.882377  ...     10.847505
MXN      EUR            SP00     A              21.898509  ...     21.782680
MYR      EUR            SP00     A               4.700118  ...      4.696800
NOK      EUR            SP00     A               9.763105  ...      9.746480
NZD      EUR            SP00     A               1.685018  ...      1.711855
PHP      EUR            SP00     A              59.882455  ...     58.425100
PLN      EUR            SP00     A               4.291595  ...      4.263505
RON      EUR            SP00     A               4.706182  ...      4.725005
RUB      EUR            SP00     A              76.305455  ...     72.402775
SEK      EUR            SP00     A              10.268541  ...     10.626295
SGD      EUR            SP00     A               1.548614  ...      1.539010
THB      EUR            SP00     A              36.318273  ...     35.138600
TRY      EUR            SP00     A               6.136482  ...      6.561920
TWD      EUR            SP00     A              35.201205  ...     35.332025
USD      EUR            SP00     A               1.141641  ...      1.129340
ZAR      EUR            SP00     A              15.816950  ...     16.474880

[56 rows x 6 columns]

Warning

For large datasets, parsing datetimes may reduce performance.

Data types

By default, pandaSDMX converts observation values to float64, other values to str. While this is a pragmatic choice in most scenarios, the default behavior can be overridden in two ways:

Specify the dtype to which observation values will be converted. To achieve this, pass dtype=<sometype> to pandasdmx.api.to_pandas().
Cast the facet value type for observation values to a corresponding pandas dtype. To do this, call pandasdmx.api.to_pandas() with dtype_from_dsd=True, dsd=<my_dsd> (new in version 1.10.0).

Work with files

Request.get() accepts the optional keyword argument tofile. If given, the response from the web service is written to the specified file, and the parsed Message is returned.

New in version 0.2.1.

A file-like object may be passed in a with-context. OpenFile instances from FSSPEC may also be used, e.g., to access a cloud storage provider’s file system.

New in version 1.2.0.

Likewise, read_sdmx() can be used to load SDMX messages stored in local files or remote files using FSSPEC.

Handle errors

Message.response carries the requests.Response.status_code attribute; in the successful queries above, the status code is 200. The SDMX web services guidelines explain the meaning of other codes. In addition, if the SDMX server has encountered an error, it may return a Message with a footer containing explanatory notes. pandaSDMX exposes footer content as Message.footer and Footer.text.

Note

pandaSDMX raises only HTTP errors with status code between 400 and 499. Codes >= 500 do not raise an error as the SDMX web services guidelines assign special meanings to those codes. The caller must therefore raise an error if needed.
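
Since codes of 500 and above pass through silently, calling code may want a small guard of its own. The sketch below is one possible shape for it, not part of pandaSDMX: the exception class and the helper are hypothetical, and the footer text is passed in as plain strings so that nothing is assumed about the exact type of Footer.text.

```python
class SDMXServerError(Exception):
    """Hypothetical exception for this sketch; not defined by pandaSDMX."""


def check_status(status_code, footer_text=()):
    """Raise for status codes >= 500, which pandaSDMX leaves to the caller.

    footer_text: any explanatory strings taken from the message footer.
    """
    if status_code >= 500:
        detail = "; ".join(footer_text) or "no footer text"
        raise SDMXServerError(f"server returned {status_code}: {detail}")
    return status_code


# With a real message, the status code would come from
# msg.response.status_code as described above.
print(check_status(200))

try:
    check_status(503, ["Service temporarily unavailable"])
except SDMXServerError as exc:
    print(exc)
```

Whether every code of 500 or above should be fatal is a judgment call for the caller, because the guidelines give some of these codes specific meanings.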