Missions mast by syed-gilani · Pull Request #2095 · astropy/astroquery

syed-gilani · 2021-06-24T15:54:24Z

This PR includes a class that makes POST API call to https://masttest.stsci.edu/search/hst/api/v0.1/search with json data that contains the search criteria. See a sample of json data below:

{"target":[""],"radius":3,"radius_units":"arcminutes","select_cols":["ang_sep","sci_aper_1234","sci_central_wavelength","sci_data_set_name","sci_dec","sci_actual_duration","sci_spec_1234","sci_hlsp","sci_instrume","sci_preview_name","sci_pep_id","sci_ra","sci_refnum","sci_release_date","scp_scan_type","sci_start_time","sci_stop_time","sci_targname"],"user_fields":[],"conditions":[{"sci_obs_type":"all"},{"sci_aec":"S"},{"sci_instrume":"stis,acs,wfc3,cos,fgs,foc,fos,ghrs,hsp,nicmos,wfpc,wfpc2"},{"sci_data_set_name":""},{"sci_pep_id":""},{"sci_pi_last_name":""},{"sci_actual_duration":""},{"sci_spec_1234":""},{"sci_release_date":""},{"sci_start_time":""}],"limit":5000,"offset":0,"sort_by":[],"sort_desc":[],"skip_count":false}

It then converts the json response from the API call into astropy Table.

pep8speaks · 2021-06-24T15:54:27Z

Hello @syed-gilani! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2022-02-24 03:44:33 UTC

ceb8

First, I recommend you change this to a WIP PR until this service gets deployed to ops at STScI, as it cannot be added to astroquery before then.

I've mentioned a few specifics in line, but before I really go through it in detail I want to address the larger picture items that I think need to be addressed.

First, I think we may have had a misunderstanding with regards to the overall design of the new functionality. I am confused as to why the user-facing class is called "Datasets" and the non-user-facing class is called "MissionMast". My understanding had been that the user-facing class would be astroquery.mast.MissionMast or similar, because that is more descriptive than "Datasets" and also matches the nomenclature we are using for the mission-mast interface itself.
In terms of the MissionsSearchAPI class, it is not clear to me why this cannot be folded into the existing ServiceAPI, the structure of the url looks pretty similar, and it would be preferable to generalize the ServiceAPI to also function for the mission interfaces if possible.

For the DatasetsClass, this interface does not follow the API specs required by astroquery. The functions should be along the lines of those provided by the Observations class, "query_object", "query_region", "query_criteria" etc. Also the functions should take advantage of the **kwargs option in python so that the user can supply filters as individual arguments rather than a single json dictionary.

Additionally, while you've called this "Datasets" generally, it only accesses the HST mission search, so there needs to be a way for the user to select the specific mission to search within, and a way to easily expand the code for future mission searches.

For the documentation, you've added the new search capabilities in the middle of the Observations section. There should be a new section called "Mission Searches" or whatever, and it needs to explain more clearly why a user would be interested in these searches and give links to the relevant STScI documentation.

codecov · 2021-06-25T21:02:19Z

Codecov Report

Merging #2095 (4f9bea3) into main (f99e1ba) will increase coverage by 0.08%.
The diff coverage is 80.61%.

❗ Current head 4f9bea3 differs from pull request most recent head 200af24. Consider uploading reports for the commit 200af24 to get more accurate results

@@            Coverage Diff             @@
##             main    #2095      +/-   ##
==========================================
+ Coverage   62.84%   62.93%   +0.08%     
==========================================
  Files         130      131       +1     
  Lines       16841    16917      +76     
==========================================
+ Hits        10584    10646      +62     
- Misses       6257     6271      +14

Impacted Files	Coverage Δ
astroquery/mast/missions.py	`79.45% <79.45%> (ø)`
astroquery/mast/services.py	`79.56% <84.00%> (+1.20%)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f99e1ba...200af24. Read the comment docs.

ceb8

Overall the functionality is quite good, although the code needs some tweaking.

The mocked tests look good. Remoted tests will also be necessary.

The docs still reflect the old Datasets functionality, so they need to be updated. Also the description of this PR needs to be updated to reflect what's in it now.

Just a suggestion, but since this is part of a larger MAST effort it might be a good idea to make an issue with the list of things MAST plans to implement and checky boxes so that users can see what's coming and track progress.

ceb8 · 2021-07-23T15:11:10Z

            col_type = np.int64
            ignore_value = -999 if (ignore_value is None) else ignore_value
-        elif col_type == "double" or col_type == "float" or col_type == "DECIMAL":
+        elif col_type == "double" or col_type.lower() == "float" or col_type == "DECIMAL":


I wonder if it would be safer to make all of our comparisons in this function case-insensitive.

The to_lower() here addresses this one case. I'm guessing the comment was intended to be broader, perhaps getting the lower case version of col_type at the beginning of the loop, then having all the comparisons be against lower case, e.g., "varchar", "null", "decimal".

I do still worry about all the variations in capitalization for col_type. It would be nice if we were consistent on the server, but that's not the case. Creating or pointing to documentation for the possible values I think could really help. I also think treating all values as lowercase here would be a little safer.

I don't see a trace of discussion about this review comment, neither changes to the code. Please don't click "resolve conversation" without actually resolving it. If discussion happened offline, then leave a summary of it at the relevant place. Thanks!

OK, I see the to lower has been moved to the top of the loop, so I am going to mark this conversation as resolved.

tomdonaldson

Thanks for this work. I think it's looking quite good, but I did have a few questions/comments.

Since I'm unfamiliar with the existing code relating to catalogs, I'd also like to do a deeper dive to explore some of the usage and test coverage, but I didn't want to delay these initial comments for that.

tomdonaldson · 2021-09-28T12:30:47Z

            col_type = np.int64
            ignore_value = -999 if (ignore_value is None) else ignore_value
-        elif col_type == "double" or col_type == "float" or col_type == "DECIMAL":
+        elif col_type == "double" or col_type.lower() == "float" or col_type == "DECIMAL":


The to_lower() here addresses this one case. I'm guessing the comment was intended to be broader, perhaps getting the lower case version of col_type at the beginning of the loop, then having all the comparisons be against lower case, e.g., "varchar", "null", "decimal".

tomdonaldson · 2021-09-28T12:35:57Z

            ignore_value = None
        # making type adjustments
-        if col_type == "char" or col_type == "STRING":
+        if col_type == "char" or col_type == "STRING" or 'VARCHAR' in col_type or col_type == "NULL":


It could be useful to have a brief comment (maybe in the doc string?) explaining the possible values of col_type and which service(s) they might come from.

please do address this, a comment in the code would be sufficient I think

still stands

tomdonaldson · 2021-09-28T13:10:30Z

+.. code-block:: python
+
+                >>> from astroquery.mast import Datasets
+                >>> params = {"target": ["40.66963 -0.01328"],


The documentation seems a little incomplete to me. As a small example, this code sample doesn't actually do a query. In general as a user, I'd be looking for the following things:

At a high level, how do the missions searches differ from the others (i.e., when might I choose to use the missions search?).

What queries are available to me?

What do each of the keys in the params dict do? (And do astroquery users need to know about all the params shown here?)

What is the syntax for specifying "conditions"?

(A pointer to) documentation about what columns are available, along with their types so I know what conditions would be applicable.

Can I do a follow-up query to retrieve data products?

bsipocz · 2021-09-28T16:06:51Z

@tomdonaldson - maybe you want to ping Erik yet again to add @syed-gilani to the org, so we don't need to approve their each and every commit...

eteq · 2021-09-30T15:10:38Z

Just sent an invite to @syed-gilani to the org. I did not add them to any of the teams, as I think just org membership is enough to not require the approval. But if there's a team I should add them to, I can do that. (Although I think y'all have admin powers over the astroquery teams?)

tomdonaldson

Thanks for the updates. This is looking really good.

I've added some minor suggestions for the docs and code coverage.

tomdonaldson · 2021-10-11T16:02:45Z

+                >>> missions.service
+                'search'
+
+missions object can be used to search meta data usiong region coordinates. the kwargs can be


Suggested change

missions object can be used to search meta data usiong region coordinates. the kwargs can be

The missions object can be used to search meta data using region coordinates. the kwargs can be

tomdonaldson · 2021-10-11T16:02:54Z

+
+missions object can be used to search meta data usiong region coordinates. the kwargs can be
+used to specify other filters like selec_cols and sort_by. The available column names to filter and
+their description for can be found here https://vao.stsci.edu/missionmast/tapservice.aspx/tables#folder38


Suggested change

their description for can be found here https://vao.stsci.edu/missionmast/tapservice.aspx/tables#folder38

their descriptions can be found here https://vao.stsci.edu/missionmast/tapservice.aspx/tables#folder38 (xml)

tomdonaldson · 2021-10-11T16:03:06Z

+                'search'
+
+missions object can be used to search meta data usiong region coordinates. the kwargs can be
+used to specify other filters like selec_cols and sort_by. The available column names to filter and


Suggested change

used to specify other filters like selec_cols and sort_by. The available column names to filter and

used to specify output characteristics like select_cols and sort_by. The available column names to filter and

tomdonaldson · 2021-10-11T16:09:16Z

+
+.. code-block:: python
+
+                >>> results = missions.query_criteria(select_cols=["sci_stop_time", "sci_targname", "sci_start_time", "sci_status"], sort_by=['sci_targname'])


Was this example meant to include some constraints on column values? It would be good to see a numeric and discrete example, I think.

tomdonaldson · 2021-10-11T16:16:11Z

+
+        # adding additional user specified parameters
+        for prop, value in kwargs.items():
+            params[prop] = value


This seems fairly fundamental. Should there be test case(s) for params[prop] = value?

tomdonaldson · 2021-10-11T16:16:23Z

+            params["radius_units"] = 'arcminutes'
+
+        if not self._service_api_connection.check_catalogs_criteria_params(criteria):
+            raise InvalidQueryError("At least one non-positional criterion must be supplied.")


If you're ok with missing unit test, I will add it in one of the future updates to this interface.

tomdonaldson

I think this looks good now. Test coverage improvements can be handled in a subsequent PR.

tomdonaldson · 2021-10-13T18:32:02Z

@bsipocz I think this is ready for a final look. We've discussed the remaining coverage issues and agreed to file an internal ticket to address them in a future PR. Please advise if you would also like to track that with a github issue.

bsipocz

Thanks @syed-gilani! This is coming along nicely.

Some of my comments are extreme nitpicking, but there are a few ones that should be addressed along with comments left unaddressed in previous reviews from others.

bsipocz · 2021-10-13T20:39:58Z

+        service_dict = {self.service: {'path': self.service, 'args': {}}}
+        self._service_api_connection.set_service_params(service_dict, f"{self.service}/{self.mission}")
+
+    def set_service(self, service):


I may be missing something, so I like to see why this, as well as set_mission was done this way, rather than handling it as part of __init__.

I think these (both set_service() and set_mission()) are not required anymore as the service can be set at initialization.

syed-gilani · 2021-10-19T14:39:57Z

@bsipocz For making all searchable fields visible in the query_criteria, I will have to add more than 50 arguments to the query_criteria method. There is an endpoint that returns the list of all columns that can be used to filter the search, This is the URL https://mast.stsci.edu/search/util/api/v0.1/column_list. I am not sure if adding more than 50 arguments to a method is a good idea

syed-gilani · 2021-10-19T16:07:29Z

@bsipocz Another point to consider is that the name of searchable fields will change from mission to mission. Right now, We only support hst mission but we are working on classy mission and it will be released soon. If we leave it as conditions then we don't have to provide multiple query_citeria for each mission with a different set of arguments

bsipocz · 2021-10-19T17:27:50Z

I will have to add more than 50 arguments to the query_criteria method

Thanks, that's good to know and I agree that in that case it's not something practical. I suppose it would be useful to add a comment about it in the code, so if anyone else also runs into it (or uses mast as an example implementation) they know that it's more the exception for this case.

Also, I wonder whether it would make sense, or just complicate things to check the validity of kwargs. As I said one of my main concerns is that invalid keywords can be passed this way which may result cases where the user assumes a certain behaviour but in fact, their input is totally ignored by the code. Addressing this goes beyond this PR.

bsipocz · 2021-10-20T17:45:26Z

+        service_dict = {self.service: {'path': self.service, 'args': {}}}
+        self._service_api_connection.set_service_params(service_dict, f"{self.service}/{self.mission}")
+
+    def set_service(self, service):


I think these (both set_service() and set_mission()) are not required anymore as the service can be set at initialization.

bsipocz · 2021-10-20T19:55:49Z

            ignore_value = None
        # making type adjustments
-        if col_type == "char" or col_type == "STRING":
+        if col_type == "char" or col_type == "STRING" or 'VARCHAR' in col_type or col_type == "NULL":


still stands

bsipocz · 2021-10-20T20:17:10Z

            col_type = np.int64
            ignore_value = -999 if (ignore_value is None) else ignore_value
-        elif col_type == "double" or col_type == "float" or col_type == "DECIMAL":
+        elif col_type == "double" or col_type.lower() == "float" or col_type == "DECIMAL":


I don't see a trace of discussion about this review comment, neither changes to the code. Please don't click "resolve conversation" without actually resolving it. If discussion happened offline, then leave a summary of it at the relevant place. Thanks!

syed-gilani · 2021-10-22T13:08:20Z

@bsipocz All things identified in the feedback have been resolved. Thanks for all the feedback and let me know if there is anything else you want me to change before this PR can be merged.

syed-gilani · 2021-10-27T16:06:01Z

@bsipocz The last outstanding issue that required making mission and service keyword only argument was fixed a few days ago. Is there anything else that you want changed before this PR can be approved?

jaymedina · 2021-11-08T20:13:52Z

+            Default 3 degrees.
+            The string must be parsable by `~astropy.coordinates.Angle`. The
+            appropriate `~astropy.units.Quantity` object from
+            `~astropy.units` may also be used. Defaults to 0.2 deg.


It says here Defaults to 0.2 deg but the default is 3 correct?

jaymedina · 2021-11-08T20:17:19Z

+    @class_or_instance
+    def query_criteria_async(self, **criteria):
+        """
+        Given an set of filters, returns a list of catalog entries.


Agreed, I think people might get confused by what filters means in this context.

jaymedina · 2021-11-08T20:20:46Z

+            Default 3 arcmin.
+            The string must be parsable by `~astropy.coordinates.Angle`.
+            The appropriate `~astropy.units.Quantity` object from
+            `~astropy.units` may also be used. Defaults to 0.2 deg.


The same Defaults to 0.2 deg is here

jaymedina · 2021-11-08T20:25:06Z

Hi @syed-gilani , I'll be taking over from Clara in maintaining astroquery's MAST module so I took a look through your PR and it looks like a really useful interface. I had a couple of comments and questions that you can look at before the merge. Functionally this is looking good. Thanks for your time.

Jenny

bsipocz · 2021-11-24T08:52:19Z

I started to go over this again and noticed that there are a few still unaddressed review comments (going back to all the way back to the first reviews from Clara). Some of those may not be relevant anymore, but I would find it useful if any trace of that would be left in the review comments, e.g. comments on the review asks for a missions parameter for given methods. Now, I see we have a missions kwarg for the class installation. So the method parameters may not be relevant any more, but it would enormously help my review task if I didn't have to go back to the commit history level to do a detective work to see which came first, the review or the class kwargs.

Anyway, this has lingered here long enough, so let's say. If both @tomdonaldson and @jaymedina signs-off on it, I'll merge with the understanding the follow-up PRs will come that adds e.g. the tests and any more outstanding comments from this review.

bsipocz · 2021-11-24T09:02:44Z

+            Valid criteria are coordinates, objectname, radius (as in `query_region` and `query_object`),
+            and all fields listed in the column documentation for the mission being queried.
+            Fields that can be used to match results on criteria. See the TAP schema link below for all field names.
+            https://vao.stsci.edu/missionmast/tapservice.aspx/tables#folder38


This is not a requirement for the PR, but @tomdonaldson, I wonder maybe there is a human-readable version of this (or a link to a tool that does that, with that XML preloaded into it)? (e.g the workflow of astroquery users to look at the XML of that link to dig up field names feels a bit clunky to me)

I totally agree with this comment. Unfortunately we don't support a clean human-readable format for this, and I'd be hesitant to point to a 3rd-party formatter.

@syed-gilani it's not obvious to me what the #folder38 part of the URL does. Are there environments where that would limit what parts of the TAP Schema are shown?

jaymedina · 2022-01-31T16:26:38Z

This looks good. @syed-gilani can you give us an update on the results from running test_mast.py and test_mast_remote.py ? Good to be sure that all the unit tests are passing. Other than that, I approve of these changes and I think it's ready for a merge after a final review from @bsipocz

tomdonaldson · 2022-02-01T02:37:18Z

@syed-gilani I'm also good with these changes as they exist today, but there are conflicts with the main branch. @bsipocz I'm assuming a rebase is in order before final approval?

syed-gilani · 2022-02-01T16:00:45Z

@tomdonaldson I resolved the conflicts and also removed the #folder38 tag from the TAP schema URL

tomdonaldson · 2022-02-01T21:50:21Z

@tomdonaldson I resolved the conflicts and also removed the #folder38 tag from the TAP schema URL

Thanks for these updates!

…g conditions

… type conversion

Removed the html tag from the TAP schema link

…criptions for a mission

bsipocz · 2022-02-24T03:58:34Z

Thank you @syed-gilani!

ceb8 requested changes Jun 25, 2021

View reviewed changes

Comment thread astroquery/mast/tests/test_mast.py Outdated

Comment thread astroquery/mast/__init__.py Outdated

Comment thread astroquery/mast/__init__.py Outdated

bsipocz marked this pull request as draft June 29, 2021 23:17

ceb8 requested changes Jul 23, 2021

View reviewed changes

syed-gilani marked this pull request as ready for review September 21, 2021 14:34

syed-gilani requested a review from ceb8 September 21, 2021 14:37

tomdonaldson suggested changes Sep 28, 2021

View reviewed changes

tomdonaldson suggested changes Oct 11, 2021

View reviewed changes

bsipocz added the mast label Oct 11, 2021

tomdonaldson approved these changes Oct 13, 2021

View reviewed changes

bsipocz reviewed Oct 14, 2021

View reviewed changes

bsipocz requested changes Oct 20, 2021

View reviewed changes

jaymedina suggested changes Nov 8, 2021

View reviewed changes

bsipocz reviewed Nov 24, 2021

View reviewed changes

jaymedina approved these changes Dec 1, 2021

View reviewed changes

syed-gilani and others added 26 commits February 23, 2022 19:43

fixed code style issues

db8f581

updted documentation with more examples and fixed an issue with addin…

1949f0f

…g conditions

changed class name from Missions to MissionsMast

977097e

replaced relative imports with absolute imports

98b63a2

fixed indentation issue and added a missing import

9d13269

fixed indentation and updated doc string

cc11223

fixed a failing test

eda0e66

removed the set methods for service and mission

b3a9594

replaced relative imports with absolute imports

858c916

changed class name from MissionsMast to MastMissions

4ad97b8

made all data type comparisons in lower case and added comments about…

975ae1a

… type conversion

fixed a failing test

27de995

added kwargs to init method of MastMissions class

ddffbe5

made mission and service keyword only arguments

213c3a3

fixed a typo in argument name

14587a6

fixed doc string and added additional assertins in test

8b9da2b

added entry to changelog

6c65884

Update missions.py

cdb0e30

Removed the html tag from the TAP schema link

updated a docstring

2660b2c

added a unit test for checking if KeyError is raised

d122e3d

fixed a doc string typo

5e3c5dd

added a method to retrieve all searchable columns names and their des…

5249cd2

…criptions for a mission

fixed code styling issues

6c8c35a

updated documentation

34e4c16

removed references to functions that no longer exist

cb1710c

Updating mailmap [ci skip]

200af24

bsipocz force-pushed the missions_mast branch from c5788ed to 200af24 Compare February 24, 2022 03:44

bsipocz merged commit 06db9c5 into astropy:main Feb 24, 2022

bsipocz mentioned this pull request Feb 24, 2022

Mast doc cleanup #1975

Merged

	missions object can be used to search meta data usiong region coordinates. the kwargs can be
	The missions object can be used to search meta data using region coordinates. the kwargs can be

	their description for can be found here https://vao.stsci.edu/missionmast/tapservice.aspx/tables#folder38
	their descriptions can be found here https://vao.stsci.edu/missionmast/tapservice.aspx/tables#folder38 (xml)

	used to specify other filters like selec_cols and sort_by. The available column names to filter and
	used to specify output characteristics like select_cols and sort_by. The available column names to filter and


		.. code-block:: python

		>>> results = missions.query_criteria(select_cols=["sci_stop_time", "sci_targname", "sci_start_time", "sci_status"], sort_by=['sci_targname'])

Uh oh!

Conversation

syed-gilani commented Jun 24, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

pep8speaks commented Jun 24, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Comment last updated at 2022-02-24 03:44:33 UTC

Uh oh!

ceb8 left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

codecov Bot commented Jun 25, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

ceb8 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

tomdonaldson left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

bsipocz commented Sep 28, 2021

Uh oh!

eteq commented Sep 30, 2021

Uh oh!

tomdonaldson left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

syed-gilani commented Jun 24, 2021 •

edited

Loading

pep8speaks commented Jun 24, 2021 •

edited

Loading

ceb8 left a comment •

edited

Loading

codecov Bot commented Jun 25, 2021 •

edited

Loading