GitHub Pages – Usage Limits (help.github.com)
165 points by carlchenet on Dec 21, 2016 | 95 comments


I am the Product Manager for GitHub Pages. As has been mentioned multiple times here, the usage limits were not in response to a specific external event. The limits have been an internal policy (in one form or another) for as long as I've been involved (nearly 4 years now), and we chose to publicize them in a series of updates beginning early this summer.

This is a classic case of "this is why we can't have nice things". If you're using GitHub Pages for your personal site or to document/talk about the work you're doing on GitHub, in general, you should be fine, even if you get HN-level traffic every once in a while.

The problem comes when a small handful of users use GitHub Pages for things like automated version checks or configuration distribution, as a makeshift ad CDN for for-profit sites, pushing an automated build every minute, or to distribute large assets (for which things like Releases are a better fit).

When a user (ab)uses GitHub Pages in a way that threatens our ability to build or serve other users' sites, technically or practically, we need to step in, and those posted limits are intended to help set expectations as to what you should and shouldn't use GitHub Pages for. But again, the vast majority of the nearly 1M users that use GitHub Pages will never hear from us (and in most cases when they did, we proactively reached out and provided ample warning/offered to help).


> Additionally, GitHub Pages sites must refrain from:

> Pornographic content

How strict is this rule?

There have been some interesting open source AI projects related to NSFW images (e.g. Yahoo_NSFW, Open_NSFW, MilesDeep). What is GitHub's policy regarding these projects? Could a GitHub Page present results? What about a link to download a training dataset?

I also just noticed that Open_NSFW's web page is hosted on GitLab (https://open_nsfw.gitlab.io/). Would a page like this (which might be considered pornographic, depending on your interpretation) be allowed on GitHub pages?


See the section on "Sexually obscene content" in the GitHub Community Guidelines (https://help.github.com/articles/github-community-guidelines...). We purposely chose the word "obscene" and not "explicit" to allow for explicit but educational, scientific, or artistic content like this.


Clever, but respectful. I did not expect this response, but thinking about it again in context of the company it is coming from, I can see why you chose that response.


Thank you. Seriously. GitHub Pages is a great service. It's saved me in a whole number of smaller projects. The Usage Limits are surprisingly liberal considering it's a complimentary service! So, thank you and your team!


Do the "requests" mean page views or http requests? A single page view almost always has multiple http requests.

Github really needs to add https support for custom domains. It's 2016, https should be the default.

[Reposting my comment]


CloudFlare supports HTTPS for GitHub Pages, and I'd definitely recommend it. Namecheap is pretty good, but CloudFlare makes everything DNS, CDN, and security related so easy for $0. (I'm not affiliated with them in any way :P )

https://blog.cloudflare.com/secure-and-fast-github-pages-wit...


The route from the user's browser to Cloudflare is encrypted (https), but the route between Cloudflare's servers and github pages is only http as Github does not support https for custom domains.

User <---https---> Cloudflare <---http---> Github pages


As long as your github.io page is served over HTTPS (it is), Cloudflare can do full HTTPS all the way through, and even Strict mode to require a valid SSL cert (which GitHub has).


I don't think this is correct for GitHub Page sites that use custom domains. See [1] and [2].

[1]: https://konklone.com/post/github-pages-now-supports-https-so...

[2]: https://github.com/isaacs/github/issues/156


Any tips on how to configure this? I'm pretty sure my setup has the problem that ploggingdev talked about.

I acknowledge the issue, but consider it better that the content the user is accessing is at least hidden for their privacy - the link between Cloudflare and GitHub is backbone-of-the-internet stuff and has a whole different set of risks. Would be nice to plug that gap, though.



EDIT: The title for this link started as "GitHub Pages sets usage limits", but was changed to "What is GitHub Pages?" by an HN moderator, putting this post (and most of the discussion going on in here) completely out of context. I don't agree with this decision. Most HN readers know what GitHub Pages is - it's the new usage caps that are the news here. Those caps are new - I've been tracking GitHub Pages for years, and the only cap they previously set was the 1GB repo size (total changes ever) limit, plus occasional hearsay from people who used to work on it: https://www.quora.com/What-are-bandwidth-and-traffic-limits-...

https://neocities.org free plan, for comparison:

Storage: 100MB (1GB by end of January).

Bandwidth: 50GB (never been enforced, many sites are over and it's fine).

SSL: Yes (forced SSL starting Jan 1st).

Change limit: one per minute, never enforced.

Number of hits: No set limit, would be in the millions if I did.

Number of changes: No set limit, not enforced, probably never will be.

But our usage limits aren't going down - they're constantly going up as we upgrade and improve our infrastructure. In fact, due to some upgrades to our anycast CDN I'm going to raise the BW limit to 200GB right now. Merry Christmas.


Does Neocities support anything like webhooks, so people can keep the normal workflow of pushing to GitHub and having the site auto-deploy? Even pushing to a Neocities remote would be fine. I currently use a Raspberry Pi server that uses webhooks and it's very nice.


It doesn't have webhooks, but it has a full API. You could use CircleCI or another build tool to publish your site when you push to GitHub. This way you also get more control over the build process and can use tools other than Jekyll.

https://neocities.org/api

You could also create a webhook using their API and AWS API Gateway + Lambda.
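
As a rough idea, the publish step at the end of a CI build could look something like the sketch below. It assumes the multipart /api/upload endpoint and HTTP Basic auth described at https://neocities.org/api, and the NEOCITIES_USER/NEOCITIES_PASS values are placeholder secrets you'd configure in your CI tool - check their docs for the exact parameters:

    # Minimal CI publish step: push a built site directory to Neocities.
    # _site is Jekyll's default output directory; the env vars are
    # hypothetical CI secrets.
    import os
    import requests

    BUILD_DIR = "_site"
    AUTH = (os.environ["NEOCITIES_USER"], os.environ["NEOCITIES_PASS"])

    for root, _dirs, files in os.walk(BUILD_DIR):
        for name in files:
            local_path = os.path.join(root, name)
            remote_path = os.path.relpath(local_path, BUILD_DIR)
            with open(local_path, "rb") as fh:
                # The form field name is the destination path on the site.
                resp = requests.post("https://neocities.org/api/upload",
                                     auth=AUTH, files={remote_path: fh})
            resp.raise_for_status()

Swap the env vars for whatever secret mechanism your build tool provides.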


There's an API right now. Github webhooks are coming, as is git push support. They were sidelined to facilitate the infrastructure upgrades so we could provide more bandwidth and storage. We'll get back to them.


GitHub pages don't support HTTPS so they're increasingly a no-go for me :(. So many endpoints are HTTPS only now that the mixed content warnings are too much hassle to deal with



"HTTPS is not supported for GitHub Pages using custom domains." - this is what they mean, they weren't specific.


A lot of people work around that by doing GitHub Pages + Cloudflare. It's free, and it gets you all the additional performance benefits that Cloudflare provides. Would probably also prevent you from running into any of these bandwidth limits.


This is what I do and it's great. CloudFlare supports HTTPS for GitHub Pages, and I'd definitely recommend it. Namecheap is pretty good, but CloudFlare makes everything DNS, CDN, and security related so easy for $0. (I'm not affiliated with them in any way :P )

https://blog.cloudflare.com/secure-and-fast-github-pages-wit...


The Cloudflare setup is only secure from user to Cloudflare, not from Cloudflare to GitHub.


As someone who works on a SaaS project with vastly different utilization across customers: sometimes limits are more a way of ensuring that customers understand the expected usage of a product, so you don't end up in the situation where one customer is adversely impacting all the other customers and has the attitude of "you said it was unlimited!"

I've definitely seen postings here and elsewhere about CloudFlare having to deal with sites whose usage is so heavy that it impacts other customers. While it's in their ToS, it isn't stated as a limit and there's often a backlash.

I'm guessing that the GitHub "limits" are likely soft. As long as the usage doesn't start adversely impacting their systems, they might not care. But there are certainly cases where an individual customer might do something that they would want the ability to shut down because it is getting to the point that it causes reliability or performance issues and these limits give them cover to do that. User expectations were set that this isn't an unlimited free-for-all.

When you architect any system, you do so with certain usage patterns in mind. If you're running a SaaS project, customers might (quite innocently) not understand your intention for the service and use it in a different way that works for them, but puts your systems in pain and adversely impacts other customers.

http://githubengineering.com/rearchitecting-github-pages/

Based on that post, it seems that any individual file is located on one active fileserver and one standby fileserver. That means that if any one individual file becomes extremely popular, it has the potential of impacting other customers located on the same box. It seems reasonable that this architecture works well for the vast majority of GitHub Pages sites, but that it wouldn't work well for some users. Yes, there are ways of mitigating the problem. You could write a balancer that isolated high-volume customers. You could do a more complex replication scheme to more locations. But that's work not needed for the vast majority of users and their intention probably isn't to become a high-volume webhost.

Similarly, one can imagine a customer with a multi-GB repository making automated changes every few seconds. It's a little hard to imagine, but in a long tail of SaaS customers, people do all sorts of weird things. Now there's potentially more data arriving than can actually be copied in time (a spinning disk won't write multiple GB in under a few seconds). Or maybe they have a certain number of "copy workers" and they don't want loads of large jobs from one customer backing up the queue.

We've definitely had customers ask our support about usage that's 1.5x some of our "limits", and honestly we probably wouldn't have even noticed. The limits are there to set the expectation that resources cost money and that we've architected our systems around certain types of usage. But in a SaaS world, someone has always (usually innocently) found something they think is a perfect fit for what they want when it really isn't made for that.


Is that new? I'm pretty sure that was on that page last time I looked at it (sometime this summer) already.

EDIT: quora answers here https://www.quora.com/What-are-bandwidth-and-traffic-limits-... mention a rule change about half a year ago, but it's not entirely clear what was added. The Wayback Machine only has versions of the page from a few months back, which already include these rules. So this was possibly added this year, but not very recently.


Is this not a chance for github to make some money? I'm about to launch a site on github pages (simple html site, no backend) and there's a tiny chance of it going viral. If it were to reach the 100k limit for requests I'd be more than happy paying github to keep it hosted there.

I already have hosting with the likes of hostgator but they probably couldn't even handle 100k like github would.


For simple static pages (with no backend), I would highly recommend putting them on Nearly Free Speech (NFSN) [1] and using the free tier of CloudFlare in front of it. NFSN charges only for what you use, unlike hosts like hostgator and others that have a monthly payment of a few dollars, at the minimum. Excluding domain costs, it will cost you just pennies a year or a few dollars a year. You can also opt for cheaper bandwidth on the site if you wish.

Without CloudFlare in front, your costs on NFSN may become quite high depending on the size of the content and number of requests.

[1]: https://www.nearlyfreespeech.net


I used to host some stuff on nearlyfreespeech. I liked the idea in principle, but keeping my account topped up and them charging me extra fees just to pay them became annoying. I switched to Amazon S3 and couldn't be happier. It was harder to setup but worth it.


> them charging me extra fees just to pay them became annoying

NFSN isn't charging you extra fees to make payments. They are being transparent about the fees everybody else hides in the total cost.


Not really. Look at their fee structure. It is more expensive than what Stripe, Paypal, and Square charge.


If you run a personal site on S3 and you get HN level traffic, you're gonna get a very bad surprise on your bank account at the end of the month.


With caching in front I wonder how big the risk is.


I'd qualify it as "not worth it".


To add to the list of alternatives, try Netlify: it has a free tier with no build limits and provides a pretty liberal free open-source plan that has more features than GitHub Pages. https://netlify.com/open-source. Just throwing it out there as well.

You can also connect your GitHub repo to get atomic deploys, which is more than you get from gh-pages. https://www.netlify.com/blog/2016/08/11/from-unstable-to-rel...


There's also surge.sh which is pretty good for static sites. It's essentially Heroku but works with a single HTML file. It's free as well.


is this something like zeit/now?


Yep, looks like it! Looks like there are more limitations with zeit though. At least from the free tier.


For simple sites, I'd advise http://wordpress.com/ - it's free and it can take any traffic you throw at it.


It's probably more of a headache than it's worth. Static site hosting is a low margin market that GitHub probably has no interest in entering.


Gitlab pages has a much higher repo size limit of 10GB, and they support custom domain https. I couldn't find anything about bandwidth limitations.

https://about.gitlab.com/2015/04/08/gitlab-dot-com-storage-l...


And they also have an order of magnitude fewer users.


Should that matter when you're running a website?


Do the "requests" mean page views or http requests? A single page view almost always has multiple http requests.

Github really needs to add https support for custom domains. It's 2016, https should be the default.

Worth mentioning that Gitlab pages supports https for custom domains. I am considering moving my blog to Gitlab. Anyone know about the resource limits and performance under Gitlab pages?


Use CloudFlare to proxy-cache your Github pages and it should be fine as long as you set correct caching rules.
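
If you want to check that the cache is actually doing the work, a quick header check is enough (example.com below is a stand-in for your own custom domain); CF-Cache-Status should report HIT once the edge has a copy:

    # Check whether Cloudflare is serving the page from its edge cache.
    # example.com is a placeholder for your own custom domain.
    import requests

    resp = requests.get("https://example.com/", timeout=10)
    print("Server:          ", resp.headers.get("Server"))           # "cloudflare" when the proxy is active
    print("CF-Cache-Status: ", resp.headers.get("CF-Cache-Status"))  # HIT means it came from cache
    print("Cache-Control:   ", resp.headers.get("Cache-Control"))    # GitHub Pages itself sends a short max-age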


Besides Cloudflare, there are these 2 GitHub-specific solutions: https://github.com/schme16/gitcdn.xyz & https://rawgit.com/faq


Linky please. Tia


There was a great recent thread with lots of alternative static hosts discussed: https://news.ycombinator.com/item?id=13021722


Somewhat off topic, but related since it's not documented anywhere: you can put ads on your GitHub Pages install.

I'm not positive about the extent or what internal rules may apply. But out of curiosity I emailed the GH legal dept and they said it was fine (after checking around and getting back a couple weeks later).

So if you want to monetize your open source projects with Adsense or otherwise, that's actually an option! (Not that I have or ever will)


If you're getting over the bandwidth limit of 100GB or 100,000 requests per month, you could definitely afford to host your site elsewhere...


My blog gets about 19k uniques a month for which linode shows bandwidth as 39GB. I have a side project (https://www.findlectures.com) that got 60k page views this month, but only because it was linked in lifehacker & The Next Web. Between scrapers and HTTP requests that do feature detection, the number of requests would easily be 10x the number of users.


And since you mention Linode, which has no free tier, you must be able to afford to host your site somewhere other than Github Pages. So that works out quite well!


Agreed- it's just a hobby thing though.


Just wanted to say that's an AWESOME side project!!


Thanks!


My blog serves ~500k requests per month, mostly to bots. My pages generally have few external resources; with a typical CMS site this would be more like 5M requests.

(This is with ~15k pageviews-not-from-bots per month.)


Wow. That's actually quite low.

I would have topped both limits with any single article that went to the front page of HN :D

Note that this is only human traffic. Wordpress stats have good bot detection.


Spot on. They're PAGES for info and such, not free WEBSITES.

This is why we can't have nice things. People don't accept a no that's free, when clearly they should.


I don't think there is a useful distinction between PAGES and WEBSITES.

https://pages.github.com/ advertises "Websites for you and your projects.", and I imagine most of the examples they show at the top get more than 100k requests, since that's not all that much for e.g. documentation or demos for a reasonably popular project. (Although it is possible that the corporate-backed projects in there actually pay for GitHub, just not specifically for Pages.)

Just to make it clear, I'm not criticising GitHub here: it seems like they use this as a guideline/reasoning aid for when someone doesn't play nice, not as strict rules to enforce, which seems totally fair. They should absolutely do something against sites misusing the service. Demanding money for commercial use would also be a possible option. (They are of course also free to make strict rules for all pages, but that would have a large impact on quite a few projects, I imagine.)


The point is, they could have called them gh-sites. Yes. Words matter. #Duh

If your "pages" are pulling that much traffic then put on your big boy (or girl) trousers, take off the training wheels and get a proper website. GH is not a hosting service. They're "kind" enough to offer free repos and that's not good enough? Y'all have lost your perspective.

Just like there's a difference between pages and sites, there's a difference between grateful and ungrateful. Y'all are like the guest that never leaves, never cleans up, never buys food, etc. That's just not reasonable.


This strikes me as a pretty useless distinction. GitHub frequently refers to them as "sites", i.e. websites, and advertises using them to host Jekyll-generated blogs.

This is why we can have nice things. People always push up against limits as long as they aren't shut down.


Words? Who needs them? Let's just say anything and/or assume we heard whatever we wanted to hear. What could go wrong? Right?

gh-pages.

The intent of GH's offer is to assist and support OS projects. Not let some wiseass freeloader treat GH like a proper hosting company. Clearly ppl are taking advantage of that, but GH is the bad guy/gal?

gh-pages.

Geez. Why did I bother? Words? They have no value.


Actually, 100,000 requests per month is a little less than a req per second. Which means that if you want to have all the services that GH is offering, I guess you would pay quite a lot (relatively speaking).


> a little less than a req per second

There are 86400 seconds in a day. At a little under a request per second, you're going to hit 100k some time around 3am on the second day, depending on what you take "a little under" to mean.


100K requests really isn't much. The page itself, some CSS, some scripts, some images, and you hit the 100K with fewer visitors than you'd think.


It's actually about two requests a minute.
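
Spelled out (assuming a 30-day month):

    # 100,000 requests spread evenly over a 30-day month
    per_day = 100000 / 30              # ~3,333 requests/day
    per_minute = per_day / (24 * 60)   # ~2.3 requests/minute
    per_second = per_minute / 60       # ~0.04 requests/second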


So this limit is technically per repo, right? I've created a small organization for myself to store mirrored government sites. None of them (so far) take up more than a gig, but I'll probably refrain from pushing giant mirrors onto GitHub. It's not that it wouldn't be trivial to push these mirrors to S3; it's just that GitHub Pages provides so many user-friendly endpoints.

A researcher recently asked me for help in mirroring one of the CMS.gov subdomains. Of course, by default, Github Pages serves up the mirror as if it were the real site:

https://wgetsnaps.github.io/marketplace.cms.gov/

But the researcher was obviously more interested in the documents published on the site than the site itself. So I told her to just check out the subdirectories, which are easy to navigate via GitHub.com's standard repo listing: https://github.com/wgetsnaps/marketplace.cms.gov

But even better, I told her she could just download it as an archived zip, another endpoint that Github provides automatically, and peruse the file tree on her own operating system: https://github.com/wgetsnaps/marketplace.cms.gov/archive/mas...

And of course, there's the option to git clone it. These are all features that can be reproduced, but I use Github not to save money on AWS, but just because of how discoverable the repos are.
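
(If you'd rather script that zip download than click through, GitHub's auto-generated archive endpoint makes it a few lines; the sketch below assumes the mirror's default branch is master:)

    # Fetch a repository snapshot via GitHub's auto-generated archive endpoint.
    # Assumes the default branch is master, as with the mirror linked above.
    import requests

    url = "https://github.com/wgetsnaps/marketplace.cms.gov/archive/master.zip"
    resp = requests.get(url, stream=True, timeout=60)
    resp.raise_for_status()
    with open("marketplace.cms.gov-master.zip", "wb") as out:
        for chunk in resp.iter_content(chunk_size=1 << 16):
            out.write(chunk)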


What do they mean by 100,000 requests/month? A request is normally every file access, not a user visit.

I have a small open source project with ~20,000 visitors/month - a one-pager with some scripts and images. That page gets more than 100,000 requests and less than 2GB of traffic.

I know that GitHub Pages is not their primary use case, but then they should either drop the support completely or add a support plan. (I would pay for it.)


For comparison, Fastmail provides static website hosting as part of their email service, and it has an 80,000 request or 2GB bandwidth limit per day. 100,000 requests or 100GB per month is extremely low.

Yes, I know that Fastmail is not free, but website hosting isn't their primary service, and I'm using it as a comparison for the amounts allowed.


Participate in an Interview on Experience with Different Git Tools

My name is Angela and I do research on the user experience of different Git tools on the market. I'm kicking off a round of discussions with people who use Git tools. Ideally, I'd like to talk to people who sit on a team of 3 or more. If this is you, I would love to talk to you about your experience using Git tools, or just some of the pain points that are keeping you up at night when doing your job.



I’ll just need 30 mins of your time, and as a token of my thanks to those that participate, I’d like to offer a US$50 Amazon gift voucher. 


If you’re interested, just shoot me an email with your availability over the next few weeks and we can set up a time to chat for 30 minutes. Please also include your timezone so we can schedule a suitable time (as I’m located in San Francisco). Hope to talk to you soon!

Cheers, 
Angela Guo [email protected]


Is this news? This has been around at least since June: https://www.quora.com/What-are-bandwidth-and-traffic-limits-...

Has anyone actually encountered enforcement of these limits?


I've seen multiple amateur projects handle the Reddit hug like champs because they were static front-ends hosted on GitHub Pages with cautious AJAX calls to backend APIs. With a limit of 100,000 requests per month (which could be consumed in a few hours in such a case), I guess this era is over.


This has been around for a long time (see other answers in this thread), so if you've seen any site handle the Reddit hug in the last year, no reason anything should change.


There's always neocities. You only get 50GB free a month, but paid seems cheap, and they seem to have their hearts in the right place.

https://neocities.org/supporter


>GitHub Pages sites have a limit of 10 builds per hour.

I didn't know this. It seems one should work locally with Jekyll and only push once everything is done and tested.

Couldn't this be relaxed via/for incremental builds?


GitHub Pages is a special type of hosting: it's the first ever transparent hosting. What you see is what is in the repo. By watching the repo, people can be quite sure they won't be hacked or served special JS they don't expect. That's why I'd love to see paid plans.
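
(As a toy illustration - "someuser" below is a placeholder, and the comparison only holds for files Jekyll copies through unchanged - you can spot-check that the served page matches the file in the repo:)

    # Spot-check that the page the Pages CDN serves matches the file in the repo.
    # "someuser" is a placeholder; user sites build from the master branch.
    import hashlib
    import requests

    served = requests.get("https://someuser.github.io/index.html", timeout=10).content
    in_repo = requests.get(
        "https://raw.githubusercontent.com/someuser/someuser.github.io/master/index.html",
        timeout=10).content

    same = hashlib.sha256(served).digest() == hashlib.sha256(in_repo).digest()
    print("match" if same else "differs")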


You can say the same thing for every "static" host. As long as you don't run custom code server-side, every line of code run on the client is readable by anyone, whichever host you use. GitHub Pages offers no more guarantees than "no (custom) code is run on the server". Also, I'm pretty confident 99% of users won't look at the code in advance anyway, even if they can understand what it does.


GitHub lets you "watch" a repo out of the box, and if many users do it, it's much harder to backdoor.


Assuming of course you trust GitHub.. And you use HTTPS. And you trust HTTPS. And you read the entire code and verified there are no backdoors whatsoever. None. At all.


Goes without saying.

But I would trust a bitcoin wallet at github.io more than a standalone web app


Hum... that's a strange move right after GitLab released the "pages" feature on their product too. I'd definitely want to see something like a free tier, with the ability to get more on a paid plan.


Again, as pointed out multiple times in this thread, this isn't new.


At least they have enabled https... I have been using it for a long time and didn't know it was possible until now.


But not for custom domains, sadly.


Those are easy to do with a (free) Cloudflare account


There's still the connection from Cloudflare to the origin server, which won't have authentication.


Couldn't Cloudflare just connect over HTTPS using the regular github.io domain?


How would Github "enable" HTTPS for custom domains? Do you want them to provide you with a free certificate for every custom domain you add?

On the other hand, if you _do_ possess a cert for your custom domain, then HTTPS works just fine [1].

[1]: https://rishav.js.org


So you're using Cloudflare. Do you have Strict SSL enabled? If not, you don't have authenticated encryption. Hiding the problem is not a solution. If GitHub doesn't let you set a custom certificate, Cloudflare isn't a solution because they don't have a way to pin a certificate rather than relying on CA authentication.


Yes, I believe you can enforce HTTPS using Cloudflare. I don't know the details because I don't manage the domain; I simply use the subdomain.


> Do you want them to provide you with a free certificate for every custom domain you add?

Yes.


> Do you want them to provide you with a free certificate for every custom domain you add?

Various competitors have started doing that after Let's Encrypt was created, so yes?


Wow, is there any way to get more views? I can pay, I need more than 100k.


I wonder if 2048 respected those limits.


They're called Pages, not Websites. If you're hitting these caps it's time to revisit your needs and your vendor/provider.


The new restrictions were put in place not long after Bloomberg reported that GitHub suffered a loss of $66M in the first 3 quarters of this year -> https://www.bloomberg.com/news/articles/2016-12-15/github-is...

The title should be updated to reflect what's happened - the changes.


As mentioned in other comments, this isn't new, it's been around at least since June https://www.quora.com/What-are-bandwidth-and-traffic-limits-...



