Pandas
This page records some of my experiences using Pandas.
Data Slicing
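Usually, data slicing is done with df.loc[row_conditions, columns]. Because the element-wise logical operators & and | bind more tightly than comparison operators such as > and ==, each comparison in row_conditions has to be wrapped in its own parentheses, so the syntax can often be cumbersome. For example: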
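A minimal sketch of such a filter, assuming a hypothetical DataFrame df with columns a, b, and c (the names and values are illustrative only, not real data):

```python
import pandas as pd

# Hypothetical example data; the column names a, b, c are illustrative.
df = pd.DataFrame({
    "a": [1, 2, 3, 4],
    "b": [10, 20, 30, 40],
    "c": ["x", "y", "x", "y"],
})

# & and | bind more tightly than comparisons such as > and ==,
# so every comparison needs its own pair of parentheses.
mask = ((df["a"] > 1) & (df["b"] < 40)) | (df["c"] == "x")

# Rows are chosen by the boolean mask, columns by the list of labels.
loc_result = df.loc[mask, ["a", "b"]]
print(loc_result)
```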
As you can see, writing complex logical expressions this way is tedious. Therefore, I currently use .query(), a higher-level method that Pandas provides on top of .loc-style boolean filtering:
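A rough sketch of a multi-condition filter written with .query(), assuming a hypothetical DataFrame df with columns a, b, and c (illustrative names and values):

```python
import pandas as pd

# Hypothetical example data; the column names a, b, c are illustrative.
df = pd.DataFrame({
    "a": [1, 2, 3, 4],
    "b": [10, 20, 30, 40],
    "c": ["x", "y", "x", "y"],
})

# Inside the query string, columns are referenced by name and the
# keywords and / or can be used, so comparisons need no extra parentheses.
query_result = df.query('(a > 1 and b < 40) or c == "x"')
print(query_result)
```

The expression is passed as a string, so string literals inside it need a different quote style from the one that delimits the expression.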
Inside .query(), it is also easy to reference variables defined outside the DataFrame, such as a constant k that I need for filtering. Both of the following forms work:
tmp = df.query('a == @k')
tmp = df.query(f'a == {k}')  # Pastes the text of k into the expression; fragile if k is a string or a list-type variable
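For list-valued variables, the @ form can be combined with the in operator. A small sketch, assuming a hypothetical list ks and an illustrative DataFrame df:

```python
import pandas as pd

# Hypothetical example data and filter values.
df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [10, 20, 30, 40]})
ks = [1, 3]

# @ks refers to the Python variable ks from the calling scope;
# "in" keeps the rows whose value in column a appears in the list.
in_result = df.query("a in @ks")
print(in_result)
```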
However, it is worth noting that .query() only filters rows (it is equivalent to the WHERE clause in SQL) and cannot be used for column selection. Therefore, if you need to select specific columns, you can write it like this:
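One possible way to do this, sketched with a hypothetical DataFrame df and a constant k (illustrative names and values), is to chain ordinary column selection after the row filter:

```python
import pandas as pd

# Hypothetical example data; the column names a, b, c are illustrative.
df = pd.DataFrame({
    "a": [1, 2, 3, 4],
    "b": [10, 20, 30, 40],
    "c": ["x", "y", "x", "y"],
})
k = 2

# Filter rows with .query(), then pick columns with a list of labels.
via_indexing = df.query("a >= @k")[["a", "c"]]

# The same selection using .loc with a full row slice.
via_loc = df.query("a >= @k").loc[:, ["a", "c"]]
print(via_indexing)
```

Both forms return the same rows and columns; which one to use is mostly a matter of taste.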