At the moment the httparchive.org App Engine code makes a call to gcloud for some dates in dates.py:
```python
# When running the site locally and unable to authenticate with GCS bypass
# loading dates from GCS.
LOAD_DATES_FROM_GCS = True
try:
    gcs = storage.Client()
    gcs.get_bucket(GCS_BUCKET)  # pragma: no cover
except (DefaultCredentialsError, RefreshError, Forbidden):  # pragma: no cover
```
This tries to handle the case where you're not logged in and the `storage.Client()` call fails, but it doesn't handle the case where you *are* logged in. It's brittle, and it broke when we locked down the bucket and needed changes (see #1232).
But now that we have an actual API endpoint, I'm hoping we can remove all this GCS logic completely and just make an API call from the front end (like we do for all the other data!) to simplify this code.
Here are the two things we use GCS for (note `GCS_BUCKET` is set to `httparchive`):
```python
def get_dates():  # pragma: no cover
    ...
    bucket = gcs.get_bucket(GCS_BUCKET)
    iterator = bucket.list_blobs(prefix="reports/20", delimiter="/")
```
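For context, that prefix/delimiter listing effectively enumerates date-named folders under `reports/`. A minimal sketch of the extraction step in plain Python, assuming folder names like `reports/2022_06_01/` (the `dates_from_prefixes` helper and the exact date format are my assumptions, not the actual code):

```python
import re


def dates_from_prefixes(prefixes):
    """Extract report dates from GCS folder prefixes.

    `prefixes` stands in for what list_blobs(prefix="reports/20",
    delimiter="/") exposes via its prefixes set; the YYYY_MM_DD
    folder naming here is an assumption.
    """
    dates = []
    for prefix in prefixes:
        match = re.match(r"reports/(20\d{2}_\d{2}_\d{2})/$", prefix)
        if match:
            dates.append(match.group(1))
    return sorted(dates, reverse=True)  # newest first
```

If the API can return this list directly, the front end never needs to know about the bucket layout at all.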
and, for each of the individual reports:
```python
def get_latest_date(dates, metric_id):  # pragma: no cover
    ...
    bucket = gcs.get_bucket(GCS_BUCKET)
    for date in dates:
        response = bucket.get_blob("reports/%s/%s.json" % (date, metric_id))
```
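In other words, this walks the dates newest-first and returns the first one whose report blob exists. A minimal sketch of that logic with the GCS lookup factored out (the `blob_exists` callable is hypothetical, standing in for the `bucket.get_blob(...)` check):

```python
def latest_date_with_report(dates, blob_exists):
    """Return the most recent date whose report exists, else None.

    `dates` is assumed to be ordered newest first; `blob_exists` is a
    hypothetical callable replacing the real bucket.get_blob(...) check.
    """
    for date in dates:
        if blob_exists(date):
            return date
    return None
```

An API endpoint that already knows the latest available date per metric would make even this loop unnecessary on the client.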
Hopefully those can be replicated in API calls?
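If the API grows a dates endpoint, the consuming code could reduce to something like the sketch below. The endpoint path and response shape (`{"dates": [...]}`) are pure guesses for illustration, not an existing API:

```python
import json


def parse_dates_response(payload):
    """Parse a hypothetical dates-endpoint response into a newest-first list.

    Assumes a JSON body of the form {"dates": ["2022_05_01", ...]};
    the real endpoint, if built, may differ.
    """
    data = json.loads(payload)
    return sorted(data["dates"], reverse=True)


# Hypothetical usage against an assumed endpoint:
#   payload = requests.get("https://example.org/v1/dates").text
#   dates = parse_dates_response(payload)
```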