Skip to content

Optimize getting config item values #17461

Merged
pllim merged 4 commits into
astropy:mainfrom
saimn:config
Dec 10, 2024
Merged

Optimize getting config item values #17461
pllim merged 4 commits into
astropy:mainfrom
saimn:config

Conversation

@saimn

@saimn saimn commented Nov 26, 2024

Copy link
Copy Markdown
Contributor

Fix #12504 (I think, doing better might be possible but not sure without breaking some feature of our complicated configuration system):

As explained there, getting a config item is quite slow, values being stored within the ConfigObj object . And io.fits accesses config items when e.g. accessing the value of a card, this means a lot of config item calls for tables especially with many columns.

So here I'm adding a cache on the ConfigItem descriptor, allowing to get the key much faster while still storing its value on the ConfigObj object.

Description

This pull request is to address ...

Fixes #

  • By checking this box, the PR author has requested that maintainers do NOT use the "Squash and Merge" button. Maintainers should respect this when possible; however, the final decision is at the discretion of the maintainer that merges the PR.

@saimn saimn added this to the v7.1.0 milestone Nov 26, 2024
@github-actions

Copy link
Copy Markdown
Contributor

Thank you for your contribution to Astropy! 🌌 This checklist is meant to remind the package maintainers who will review this pull request of some common things to look for.

  • Do the proposed changes actually accomplish desired goals?
  • Do the proposed changes follow the Astropy coding guidelines?
  • Are tests added/updated as required? If so, do they follow the Astropy testing guidelines?
  • Are docs added/updated as required? If so, do they follow the Astropy documentation guidelines?
  • Is rebase and/or squash necessary? If so, please provide the author with appropriate instructions. Also see instructions for rebase and squash.
  • Did the CI pass? If no, are the failures related? If you need to run daily and weekly cron jobs as part of the PR, please apply the "Extra CI" label. Codestyle issues can be fixed by the bot.
  • Is a change log needed? If yes, did the change log check pass? If no, add the "no-changelog-entry-needed" label. If this is a manual backport, use the "skip-changelog-checks" label unless special changelog handling is necessary.
  • Is this a big PR that makes a "What's new?" entry worthwhile and if so, is (1) a "what's new" entry included in this PR and (2) the "whatsnew-needed" label applied?
  • At the time of adding the milestone, if the milestone set requires a backport to release branch(es), apply the appropriate "backport-X.Y.x" label(s) before merge.

@saimn

saimn commented Nov 26, 2024

Copy link
Copy Markdown
Contributor Author

Forgot to add some timings.
Before:

In [2]: %timeit fits.conf.enable_record_valued_keyword_cards
2.69 μs ± 502 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

In [3]: %timeit fits.getdata("astropy/io/fits/tests/data/tb.fits")
1.73 ms ± 338 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [2]: %timeit fits.getdata("astropy/io/fits/tests/data/memtest.fits")
11.5 ms ± 53.6 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)

After:

In [2]: %timeit fits.conf.enable_record_valued_keyword_cards
55.2 ns ± 0.163 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

In [3]: %timeit fits.getdata("astropy/io/fits/tests/data/tb.fits")
1.09 ms ± 6.35 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [2]: %timeit fits.getdata("astropy/io/fits/tests/data/memtest.fits")
9.17 ms ± 11.2 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)

@saimn saimn requested a review from pllim November 26, 2024 20:53
Comment thread astropy/config/configuration.py Outdated
Comment on lines +331 to +335
try:
val = self.value
except AttributeError:
val = self.value = self()
return val

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about this?

Suggested change
try:
val = self.value
except AttributeError:
val = self.value = self()
return val
if not hasattr(self, "value"):
self.value = self()
return self.value

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

try/except has no cost if there is no exception, so I think it is good to use it (hasattr uses it internally, but now you do have the cost of the if statement). I'd try to shave off another few ns with

try:
    return self.value
except AttributeError:
    self.value = val = self()
    return val

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah try/except is commonly used in such cases both for perf and also hasattr can have some side effects so simpler is better here.

Comment on lines +406 to +409
try:
del self.value
except AttributeError:
pass

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe like this instead? Though I don't see how this is equivalent to self.set(self.defaultvalue).

Suggested change
try:
del self.value
except AttributeError:
pass
if hasattr(self, "value"):
del self.value

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For self.set(self.defaultvalue) it's annoying but I don't remember 😬 (I really should open PRs faster!)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note that the try/except form is more robust to parallelism (which might matter in the wake of free-threaded Python ?)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As long as hitting except is rare...

@@ -0,0 +1 @@
Optimize getting config item values.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we roughly quantify this?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a noticeable change if you get a few config values but it is for io.fits which uses config items when e.g. getting card values. So when reading many keywords or a table with many columns it starts to matter.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't know FITS uses this... 🙈

@pllim pllim added the benchmark Run benchmarks for a PR label Nov 26, 2024

@mhvk mhvk left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me, though I haven't really looked at this code before, so it would be good for someone else to check too. Benchmarks certainly look good (the one bad one should be an unrelated fluke). But need some precommit fix or so.

Comment thread astropy/config/configuration.py

@mhvk mhvk left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, looks all good to me.

@pllim pllim left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@mhvk

mhvk commented May 13, 2025

Copy link
Copy Markdown
Contributor

Unfortunately, this caching of config items introduced a bug, in that reload_config does not reset the values cached here - see #18138.

pllim added a commit to pllim/astropy that referenced this pull request May 13, 2025
pllim added a commit that referenced this pull request May 14, 2025
* Revert "Optimize getting config item values  (#17461)"

This reverts commit 49765f4.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

benchmark Run benchmarks for a PR config Performance

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ConfigNamespace is slow

4 participants