Skip to content

Missions mast#2095

Merged
bsipocz merged 48 commits into
astropy:mainfrom
syed-gilani:missions_mast
Feb 24, 2022
Merged

Missions mast#2095
bsipocz merged 48 commits into
astropy:mainfrom
syed-gilani:missions_mast

Conversation

@syed-gilani

@syed-gilani syed-gilani commented Jun 24, 2021

Copy link
Copy Markdown
Contributor

This PR includes a class that makes POST API call to https://masttest.stsci.edu/search/hst/api/v0.1/search with json data that contains the search criteria. See a sample of json data below:

{"target":[""],"radius":3,"radius_units":"arcminutes","select_cols":["ang_sep","sci_aper_1234","sci_central_wavelength","sci_data_set_name","sci_dec","sci_actual_duration","sci_spec_1234","sci_hlsp","sci_instrume","sci_preview_name","sci_pep_id","sci_ra","sci_refnum","sci_release_date","scp_scan_type","sci_start_time","sci_stop_time","sci_targname"],"user_fields":[],"conditions":[{"sci_obs_type":"all"},{"sci_aec":"S"},{"sci_instrume":"stis,acs,wfc3,cos,fgs,foc,fos,ghrs,hsp,nicmos,wfpc,wfpc2"},{"sci_data_set_name":""},{"sci_pep_id":""},{"sci_pi_last_name":""},{"sci_actual_duration":""},{"sci_spec_1234":""},{"sci_release_date":""},{"sci_start_time":""}],"limit":5000,"offset":0,"sort_by":[],"sort_desc":[],"skip_count":false}

It then converts the json response from the API call into astropy Table.

@pep8speaks

pep8speaks commented Jun 24, 2021

Copy link
Copy Markdown

Hello @syed-gilani! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2022-02-24 03:44:33 UTC

@ceb8 ceb8 left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First, I recommend you change this to a WIP PR until this service gets deployed to ops at STScI, as it cannot be added to astroquery before then.

I've mentioned a few specifics in line, but before I really go through it in detail I want to address the larger picture items that I think need to be addressed.

First, I think we may have had a misunderstanding with regards to the overall design of the new functionality. I am confused as to why the user-facing class is called "Datasets" and the non-user-facing class is called "MissionMast". My understanding had been that the user-facing class would be astroquery.mast.MissionMast or similar,  because that is more descriptive than "Datasets" and also matches the nomenclature we are using for the mission-mast interface itself.
In terms of the MissionsSearchAPI class, it is not clear to me why this cannot be folded into the existing ServiceAPI, the structure of the url looks pretty similar, and it would be preferable to generalize the ServiceAPI to also function for the mission interfaces if possible.

For the DatasetsClass, this interface does not follow the API specs required by astroquery. The functions should be along the lines of those provided by the Observations class, "query_object", "query_region", "query_criteria" etc. Also the functions should take advantage of the **kwargs option in python so that the user can supply filters as individual arguments rather than a single json dictionary.

Additionally, while you've called this "Datasets" generally, it only accesses the HST mission search, so there needs to be a way for the user to select the specific mission to search within, and a way to easily expand the code for future mission searches.

For the documentation,  you've added the new search capabilities in the middle of the Observations section. There should be a new section called "Mission Searches" or whatever, and it needs to explain more clearly why a user would be interested in these searches and give links to the relevant STScI documentation.

Comment thread astroquery/mast/tests/test_mast.py Outdated
Comment thread astroquery/mast/__init__.py Outdated
Comment thread astroquery/mast/__init__.py Outdated
@codecov

codecov Bot commented Jun 25, 2021

Copy link
Copy Markdown

Codecov Report

Merging #2095 (4f9bea3) into main (f99e1ba) will increase coverage by 0.08%.
The diff coverage is 80.61%.

❗ Current head 4f9bea3 differs from pull request most recent head 200af24. Consider uploading reports for the commit 200af24 to get more accurate results

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #2095      +/-   ##
==========================================
+ Coverage   62.84%   62.93%   +0.08%     
==========================================
  Files         130      131       +1     
  Lines       16841    16917      +76     
==========================================
+ Hits        10584    10646      +62     
- Misses       6257     6271      +14     
Impacted Files Coverage Δ
astroquery/mast/missions.py 79.45% <79.45%> (ø)
astroquery/mast/services.py 79.56% <84.00%> (+1.20%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f99e1ba...200af24. Read the comment docs.

@bsipocz bsipocz marked this pull request as draft June 29, 2021 23:17

@ceb8 ceb8 left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall the functionality is quite good, although the code needs some tweaking.

The mocked tests look good. Remoted tests will also be necessary.

The docs still reflect the old Datasets functionality, so they need to be updated. Also the description of this PR needs to be updated to reflect what's in it now.

Just a suggestion, but since this is part of a larger MAST effort it might be a good idea to make an issue with the list of things MAST plans to implement and checky boxes so that users can see what's coming and track progress.

Comment thread astroquery/mast/missions.py Outdated
Comment thread astroquery/mast/missions.py Outdated
Comment thread astroquery/mast/missions.py Outdated
Comment thread astroquery/mast/missions.py Outdated
Comment thread astroquery/mast/missions.py Outdated
Comment thread astroquery/mast/services.py Outdated
col_type = np.int64
ignore_value = -999 if (ignore_value is None) else ignore_value
elif col_type == "double" or col_type == "float" or col_type == "DECIMAL":
elif col_type == "double" or col_type.lower() == "float" or col_type == "DECIMAL":

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it would be safer to make all of our comparisons in this function case-insensitive.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The to_lower() here addresses this one case. I'm guessing the comment was intended to be broader, perhaps getting the lower case version of col_type at the beginning of the loop, then having all the comparisons be against lower case, e.g., "varchar", "null", "decimal".

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do still worry about all the variations in capitalization for col_type. It would be nice if we were consistent on the server, but that's not the case. Creating or pointing to documentation for the possible values I think could really help. I also think treating all values as lowercase here would be a little safer.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a trace of discussion about this review comment, neither changes to the code. Please don't click "resolve conversation" without actually resolving it. If discussion happened offline, then leave a summary of it at the relevant place. Thanks!

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I see the to lower has been moved to the top of the loop, so I am going to mark this conversation as resolved.

Comment thread astroquery/mast/tests/mission_results.json Outdated
Comment thread astroquery/mast/tests/test_mast.py Outdated
Comment thread astroquery/mast/tests/test_mast.py Outdated
Comment thread docs/mast/mast.rst
@syed-gilani syed-gilani marked this pull request as ready for review September 21, 2021 14:34
@syed-gilani syed-gilani requested a review from ceb8 September 21, 2021 14:37

@tomdonaldson tomdonaldson left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this work. I think it's looking quite good, but I did have a few questions/comments.

Since I'm unfamiliar with the existing code relating to catalogs, I'd also like to do a deeper dive to explore some of the usage and test coverage, but I didn't want to delay these initial comments for that.

Comment thread astroquery/mast/missions.py Outdated
Comment thread astroquery/mast/services.py Outdated
col_type = np.int64
ignore_value = -999 if (ignore_value is None) else ignore_value
elif col_type == "double" or col_type == "float" or col_type == "DECIMAL":
elif col_type == "double" or col_type.lower() == "float" or col_type == "DECIMAL":

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The to_lower() here addresses this one case. I'm guessing the comment was intended to be broader, perhaps getting the lower case version of col_type at the beginning of the loop, then having all the comparisons be against lower case, e.g., "varchar", "null", "decimal".

Comment thread astroquery/mast/services.py
Comment thread astroquery/mast/services.py Outdated
Comment thread astroquery/mast/services.py Outdated
ignore_value = None
# making type adjustments
if col_type == "char" or col_type == "STRING":
if col_type == "char" or col_type == "STRING" or 'VARCHAR' in col_type or col_type == "NULL":

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be useful to have a brief comment (maybe in the doc string?) explaining the possible values of col_type and which service(s) they might come from.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please do address this, a comment in the code would be sufficient I think

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still stands

Comment thread astroquery/mast/services.py Outdated
Comment thread astroquery/mast/services.py
Comment thread astroquery/mast/services.py
Comment thread astroquery/mast/services.py
Comment thread docs/mast/mast.rst Outdated
.. code-block:: python

>>> from astroquery.mast import Datasets
>>> params = {"target": ["40.66963 -0.01328"],

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation seems a little incomplete to me. As a small example, this code sample doesn't actually do a query. In general as a user, I'd be looking for the following things:

  • At a high level, how do the missions searches differ from the others (i.e., when might I choose to use the missions search?).
  • What queries are available to me?
  • What do each of the keys in the params dict do? (And do astroquery users need to know about all the params shown here?)
  • What is the syntax for specifying "conditions"?
  • (A pointer to) documentation about what columns are available, along with their types so I know what conditions would be applicable.
  • Can I do a follow-up query to retrieve data products?

@bsipocz

bsipocz commented Sep 28, 2021

Copy link
Copy Markdown
Member

@tomdonaldson - maybe you want to ping Erik yet again to add @syed-gilani to the org, so we don't need to approve their each and every commit...

@eteq

eteq commented Sep 30, 2021

Copy link
Copy Markdown
Member

Just sent an invite to @syed-gilani to the org. I did not add them to any of the teams, as I think just org membership is enough to not require the approval. But if there's a team I should add them to, I can do that. (Although I think y'all have admin powers over the astroquery teams?)

@tomdonaldson tomdonaldson left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates. This is looking really good.

I've added some minor suggestions for the docs and code coverage.

Comment thread docs/mast/mast.rst Outdated
>>> missions.service
'search'

missions object can be used to search meta data usiong region coordinates. the kwargs can be

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
missions object can be used to search meta data usiong region coordinates. the kwargs can be
The missions object can be used to search meta data using region coordinates. the kwargs can be

Comment thread docs/mast/mast.rst Outdated

missions object can be used to search meta data usiong region coordinates. the kwargs can be
used to specify other filters like selec_cols and sort_by. The available column names to filter and
their description for can be found here https://vao.stsci.edu/missionmast/tapservice.aspx/tables#folder38

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
their description for can be found here https://vao.stsci.edu/missionmast/tapservice.aspx/tables#folder38
their descriptions can be found here https://vao.stsci.edu/missionmast/tapservice.aspx/tables#folder38 (xml)

Comment thread docs/mast/mast.rst Outdated
'search'

missions object can be used to search meta data usiong region coordinates. the kwargs can be
used to specify other filters like selec_cols and sort_by. The available column names to filter and

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
used to specify other filters like selec_cols and sort_by. The available column names to filter and
used to specify output characteristics like select_cols and sort_by. The available column names to filter and

Comment thread docs/mast/mast.rst Outdated

.. code-block:: python

>>> results = missions.query_criteria(select_cols=["sci_stop_time", "sci_targname", "sci_start_time", "sci_status"], sort_by=['sci_targname'])

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this example meant to include some constraints on column values? It would be good to see a numeric and discrete example, I think.

Comment thread docs/mast/mast.rst
Comment thread astroquery/mast/missions.py Outdated

# adding additional user specified parameters
for prop, value in kwargs.items():
params[prop] = value

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems fairly fundamental. Should there be test case(s) for params[prop] = value?

params["radius_units"] = 'arcminutes'

if not self._service_api_connection.check_catalogs_criteria_params(criteria):
raise InvalidQueryError("At least one non-positional criterion must be supplied.")

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test case?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're ok with missing unit test, I will add it in one of the future updates to this interface.

Comment thread astroquery/mast/services.py
Comment thread astroquery/mast/services.py
@bsipocz bsipocz added the mast label Oct 11, 2021

@tomdonaldson tomdonaldson left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks good now. Test coverage improvements can be handled in a subsequent PR.

@tomdonaldson

Copy link
Copy Markdown

@bsipocz I think this is ready for a final look. We've discussed the remaining coverage issues and agreed to file an internal ticket to address them in a future PR. Please advise if you would also like to track that with a github issue.

@bsipocz bsipocz left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @syed-gilani! This is coming along nicely.

Some of my comments are extreme nitpicking, but there are a few ones that should be addressed along with comments left unaddressed in previous reviews from others.

Comment thread astroquery/mast/__init__.py Outdated
Comment thread astroquery/mast/missions.py Outdated
Comment thread astroquery/mast/missions.py Outdated
Comment thread astroquery/mast/missions.py Outdated
Comment thread astroquery/mast/missions.py Outdated
service_dict = {self.service: {'path': self.service, 'args': {}}}
self._service_api_connection.set_service_params(service_dict, f"{self.service}/{self.mission}")

def set_service(self, service):

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may be missing something, so I like to see why this, as well as set_mission was done this way, rather than handling it as part of __init__.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these (both set_service() and set_mission()) are not required anymore as the service can be set at initialization.

Comment thread astroquery/mast/services.py
Comment thread astroquery/mast/services.py Outdated
Comment thread astroquery/mast/tests/test_mast.py Outdated
Comment thread astroquery/mast/tests/test_mast.py Outdated
Comment thread astroquery/mast/tests/test_mast.py Outdated
@syed-gilani

Copy link
Copy Markdown
Contributor Author

@bsipocz For making all searchable fields visible in the query_criteria, I will have to add more than 50 arguments to the query_criteria method. There is an endpoint that returns the list of all columns that can be used to filter the search, This is the URL https://mast.stsci.edu/search/util/api/v0.1/column_list. I am not sure if adding more than 50 arguments to a method is a good idea

@syed-gilani

Copy link
Copy Markdown
Contributor Author

@bsipocz Another point to consider is that the name of searchable fields will change from mission to mission. Right now, We only support hst mission but we are working on classy mission and it will be released soon. If we leave it as conditions then we don't have to provide multiple query_citeria for each mission with a different set of arguments

@bsipocz

bsipocz commented Oct 19, 2021

Copy link
Copy Markdown
Member

I will have to add more than 50 arguments to the query_criteria method

Thanks, that's good to know and I agree that in that case it's not something practical. I suppose it would be useful to add a comment about it in the code, so if anyone else also runs into it (or uses mast as an example implementation) they know that it's more the exception for this case.

Also, I wonder whether it would make sense, or just complicate things to check the validity of kwargs. As I said one of my main concerns is that invalid keywords can be passed this way which may result cases where the user assumes a certain behaviour but in fact, their input is totally ignored by the code. Addressing this goes beyond this PR.

Comment thread astroquery/mast/__init__.py Outdated
Comment thread astroquery/mast/missions.py Outdated
Comment thread astroquery/mast/missions.py Outdated
service_dict = {self.service: {'path': self.service, 'args': {}}}
self._service_api_connection.set_service_params(service_dict, f"{self.service}/{self.mission}")

def set_service(self, service):

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these (both set_service() and set_mission()) are not required anymore as the service can be set at initialization.

Comment thread astroquery/mast/services.py Outdated
Comment thread astroquery/mast/services.py Outdated
ignore_value = None
# making type adjustments
if col_type == "char" or col_type == "STRING":
if col_type == "char" or col_type == "STRING" or 'VARCHAR' in col_type or col_type == "NULL":

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still stands

Comment thread astroquery/mast/services.py Outdated
col_type = np.int64
ignore_value = -999 if (ignore_value is None) else ignore_value
elif col_type == "double" or col_type == "float" or col_type == "DECIMAL":
elif col_type == "double" or col_type.lower() == "float" or col_type == "DECIMAL":

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a trace of discussion about this review comment, neither changes to the code. Please don't click "resolve conversation" without actually resolving it. If discussion happened offline, then leave a summary of it at the relevant place. Thanks!

@syed-gilani

Copy link
Copy Markdown
Contributor Author

@bsipocz All things identified in the feedback have been resolved. Thanks for all the feedback and let me know if there is anything else you want me to change before this PR can be merged.

@syed-gilani

Copy link
Copy Markdown
Contributor Author

@bsipocz The last outstanding issue that required making mission and service keyword only argument was fixed a few days ago. Is there anything else that you want changed before this PR can be approved?

Comment thread astroquery/mast/missions.py
Comment thread astroquery/mast/tests/test_mast.py
Comment thread astroquery/mast/tests/test_mast.py
Comment thread astroquery/mast/missions.py Outdated
Default 3 degrees.
The string must be parsable by `~astropy.coordinates.Angle`. The
appropriate `~astropy.units.Quantity` object from
`~astropy.units` may also be used. Defaults to 0.2 deg.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It says here Defaults to 0.2 deg but the default is 3 correct?

Comment thread astroquery/mast/missions.py Outdated
@class_or_instance
def query_criteria_async(self, **criteria):
"""
Given an set of filters, returns a list of catalog entries.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, I think people might get confused by what filters means in this context.

Comment thread astroquery/mast/missions.py Outdated
Default 3 arcmin.
The string must be parsable by `~astropy.coordinates.Angle`.
The appropriate `~astropy.units.Quantity` object from
`~astropy.units` may also be used. Defaults to 0.2 deg.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same Defaults to 0.2 deg is here

@jaymedina

Copy link
Copy Markdown
Contributor

Hi @syed-gilani , I'll be taking over from Clara in maintaining astroquery's MAST module so I took a look through your PR and it looks like a really useful interface. I had a couple of comments and questions that you can look at before the merge. Functionally this is looking good. Thanks for your time.

Jenny

@bsipocz

bsipocz commented Nov 24, 2021

Copy link
Copy Markdown
Member

I started to go over this again and noticed that there are a few still unaddressed review comments (going back to all the way back to the first reviews from Clara). Some of those may not be relevant anymore, but I would find it useful if any trace of that would be left in the review comments, e.g. comments on the review asks for a missions parameter for given methods. Now, I see we have a missions kwarg for the class installation. So the method parameters may not be relevant any more, but it would enormously help my review task if I didn't have to go back to the commit history level to do a detective work to see which came first, the review or the class kwargs.

Anyway, this has lingered here long enough, so let's say. If both @tomdonaldson and @jaymedina signs-off on it, I'll merge with the understanding the follow-up PRs will come that adds e.g. the tests and any more outstanding comments from this review.

Comment thread astroquery/mast/missions.py Outdated
Valid criteria are coordinates, objectname, radius (as in `query_region` and `query_object`),
and all fields listed in the column documentation for the mission being queried.
Fields that can be used to match results on criteria. See the TAP schema link below for all field names.
https://vao.stsci.edu/missionmast/tapservice.aspx/tables#folder38

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a requirement for the PR, but @tomdonaldson, I wonder maybe there is a human-readable version of this (or a link to a tool that does that, with that XML preloaded into it)? (e.g the workflow of astroquery users to look at the XML of that link to dig up field names feels a bit clunky to me)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I totally agree with this comment. Unfortunately we don't support a clean human-readable format for this, and I'd be hesitant to point to a 3rd-party formatter.

@syed-gilani it's not obvious to me what the #folder38 part of the URL does. Are there environments where that would limit what parts of the TAP Schema are shown?

@jaymedina

Copy link
Copy Markdown
Contributor

This looks good. @syed-gilani can you give us an update on the results from running test_mast.py and test_mast_remote.py ? Good to be sure that all the unit tests are passing. Other than that, I approve of these changes and I think it's ready for a merge after a final review from @bsipocz

@tomdonaldson

Copy link
Copy Markdown

@syed-gilani I'm also good with these changes as they exist today, but there are conflicts with the main branch. @bsipocz I'm assuming a rebase is in order before final approval?

@syed-gilani

Copy link
Copy Markdown
Contributor Author

@tomdonaldson I resolved the conflicts and also removed the #folder38 tag from the TAP schema URL

@tomdonaldson

Copy link
Copy Markdown

@tomdonaldson I resolved the conflicts and also removed the #folder38 tag from the TAP schema URL

Thanks for these updates!

@bsipocz bsipocz merged commit 06db9c5 into astropy:main Feb 24, 2022
@bsipocz

bsipocz commented Feb 24, 2022

Copy link
Copy Markdown
Member

Thank you @syed-gilani!

@bsipocz bsipocz mentioned this pull request Feb 24, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants