Hacker News

When you use Copilot integrated in the editor, does Microsoft collect all source code data on your project, or only the context used to perform the completions?


When you sign up for Copilot, there's a settings section for it on GitHub. One option you can toggle is "Allow GitHub to use my code snippets for product improvements".

Context still needs to be processed, so the surrounding line, the enclosing block, and a couple of open tabs get piped into the prompt.

And here's a quote from the privacy page.

> Depending on your preferred telemetry settings, GitHub Copilot may also collect and retain the following, collectively referred to as “code snippets”: source code that you are editing, related files and other files open in the same IDE or editor, URLs of repositories and files path.


What is the default?


To share. But it's a very obvious checkbox.


Concern over this is the #1 reason I haven't yet tried Copilot. For my hobby projects I don't care enough to pay for it. And if it's phoning home with proprietary code, I can't allow that to happen.


Eh, as long as my employers don't care (they don't), I don't care. I have no illusions that my code/our code will give Microsoft any valuable training data it couldn't trivially get elsewhere.


Mine does, and therein lies my issue.


You can always use https://github.com/salesforce/CodeGen , though it does require hosting the model yourself. You can use FauxPilot to mimic Copilot's functionality: https://github.com/fauxpilot/fauxpilot
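As a rough illustration of the self-hosting side, the CodeGen checkpoints are published on the Hugging Face Hub, so a minimal local sketch (assuming the `transformers` library; `Salesforce/codegen-350M-mono` is the smallest Python-tuned checkpoint, and the larger ones need serious hardware) looks something like:

```python
# Sketch: local code completion with Salesforce CodeGen via Hugging Face
# transformers. Downloads the checkpoint on first run.
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "Salesforce/codegen-350M-mono"  # smallest Python-tuned model
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

prompt = "def hello_world():"
inputs = tokenizer(prompt, return_tensors="pt")
# Greedy decoding for simplicity; a real completion server would use
# sampling and streaming.
outputs = model.generate(
    **inputs, max_new_tokens=32, pad_token_id=tokenizer.eos_token_id
)
completion = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(completion)
```

FauxPilot essentially wraps this kind of model behind an OpenAI-compatible API so that editor plugins can talk to it as if it were Copilot.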


IMO Copilot for Business has a very reasonable data collection policy. They discard any code snippets once the suggestion is returned.

https://github.com/features/copilot


If that's the case, would Copilot be useful anyway? Or is your code outside the range where suggestions would help?


In theory there are no rules about importing code, beyond the usual licensing issues. But people use SO and such all the time, right? If one *really* wanted to do a global audit of improperly imported code, we'd all have bigger problems. So from that perspective it's the status quo.

But I don't want to be the person caught uploading proprietary code to another company's servers.

It's not a major issue, and I doubt it'd ever be a practical problem. But fear of punishment keeps me away.


It's worth it even for hobby projects, imo. It reduces the time spent on mundane tasks and lets you think at a higher level and just move faster. Maybe you achieve a level of zen from implementing utility-level code, similar to how some people might still write assembly, but otherwise it's a valuable tool/skill to learn.

Tangentially, I think there's some fear associated with adopting AI tools, perhaps because developers feel like their skill sets are being displaced. And they are, but there's headroom: assembly programmers learned C, for example. There seem to be some post-hoc rationalizations being put forth to avoid that fear, but my sense is that developers who don't cultivate this new skill set will fall behind.


I'm reminded of a close friend of mine who is a car mechanic. In recent years the share of BEVs and PHEVs among new cars has risen to ~20%, which will absolutely influence his job and require new skills of a different kind.

Yet, despite the obvious evidence, he is unwilling to even acknowledge the possibility that this is happening and refuses to research what it could mean for him (which may be very little).

I never quite understood why. Surely just keeping in touch with the world wouldn't hurt, right?

With the rise of AI, I think I get it. There's a part of me that is scared shitless at the prospect of being made redundant in the near future, with all my acquired skill being worthless in this new world. The temptation to put my head in the sand and hope it blows over is strong.

I've resigned myself never to become like my friend and have consequently shelled out for a year of Copilot. My thinking is that at worst it's €100 wasted, and at best I'm not blindsided by what is coming anyway.

The reality will probably fall somewhere on a middle ground where there are still jobs to be found.


> my sense is that developers who don’t cultivate this new skill set will fall behind

That might be true, but it's an easy skillset to pick up compared to programming. The bigger danger is that new developers will lean on AI so much that they never pick up the fundamentals of programming, in which case they will definitely be left behind.


Many, probably. However, the curious types will likely be further enhanced by AI. I've never been one to take code at face value, and I have been enjoying sessions with ChatGPT asking all sorts of questions about some of the stuff it produces. The answer is usually sufficient, and in cases where it's not, I've been given enough background context to know where to find the answer online or in books.

Honestly, I've found myself mastering many more things since I started including it in my daily routine.


The result of this will be similar to hiring Infosys:

hundreds of thousands of lines of buggy, incomprehensible boilerplate that doesn't work on anything but the easy cases.

Then you have to rip the entire thing apart and start again with people who know what they're doing.


Can you describe how you use it? I struggle to imagine how it would even be done. I.e., do you write prompts? Just code as normal but frequently hit a "copilot" button? Etc.

Though I do wonder if it'll improve my ability to read code. PRs are a pain because I find it easier to write code than to read it. I'd pay for Copilot in a heartbeat if it were good at spotting PR errors and the like.


Just type your code in the editor, and it offers autocomplete suggestions. Sometimes it will complete the entire function based on the function name or a comment. Sometimes it'll just guess the function you want to write, without you typing anything at all. (It turns out a lot of code is rather predictable.)

In my experience, though, it's best to go line by line rather than accepting whole-function autocompletes.

For me, it's been incredibly useful for generating test cases. It will type out test functions for various conditions, stuff that is normally really tedious to code.

Sometimes it's eerie how well it knows exactly what the next line should be. Countless times it has filled in an important detail that I hadn't thought of.

It's not perfect at all; sometimes it goes off on tangents or writes incorrect code.

I don’t think you even have to pay for copilot. At least it’s free for me.
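To make the comment-driven flow above concrete, here's the kind of exchange that happens. The function and test bodies below are typical of what it suggests, not verbatim Copilot output:

```python
# You type just this comment and signature...
# convert a snake_case string to camelCase
def to_camel_case(s: str) -> str:
    # ...and Copilot will typically suggest a body like this:
    head, *rest = s.split("_")
    return head + "".join(word.capitalize() for word in rest)


# Then, given the function above, it will often propose matching tests:
def test_to_camel_case():
    assert to_camel_case("hello_world") == "helloWorld"
    assert to_camel_case("single") == "single"
```

You accept or reject each suggestion with a keystroke (Tab/Esc in most editors), which is why going line by line keeps you in control.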


They have a limited trial and company memberships, afaik.

It costs $10/month or $100/year for individual users.


~~That's weird because I don't pay anything.~~

EDIT: GitHub Copilot is free to use for verified students, teachers, and maintainers of popular open source projects


The Adobe model: let students and schools train on it, then demand employers buy the subscription when they graduate.


I use GitHub, so it's not really a concern for me; they already have my code.


IIRC they didn't train on private repos, though, so using Copilot in a private (GitHub) repo will potentially open up your proprietary code to being used in that way.


No, the model doesn't train on your private code (which is good, but also somewhat limiting: in my experience it doesn't provide useful answers that are very specific to your codebase). It's good for generic code, though, and saves time looking stuff up.


From my experience subscribing and testing it out with the Sublime extension, you get to decide whether your code gets piped up into their model.

Not that I've verified it by monitoring network calls.


The same is true if you use `git push`, in which case all the code is transferred over the wire and collected by GitHub, which may or may not be desirable.


git != GitHub


I think OP's point was that GitHub=Microsoft, so you're effectively sending your code to Microsoft in one way or another. Although the licensing/privacy policies are probably different for private repositories.



