I think it's fantastic that this problem has people's attention.
I have long wished that code changes to a repository could be accompanied by code refactorings that are intended to be applied to code using that repository. For example, if you rename f() to g(), then you could accompany this by a refactoring that transforms existing callers of f() to use g() as well. I'd envision this as a build step that tells you that automated repairs are available.
The refactoring could be a small but limited program of its own, that is evaluated against the program abstract syntax graph, and that can be as powerful as is warranted or needed to properly transform programs using the code. Moving or renaming code could be a relatively simple type of refactoring. However, if you've renamed f(int) to g(int, int) such that callers of f(N) should call g(N, 0), then a slightly more complex refactoring script could handle that too. I would think of these refactoring scripts as something like how Git treats changes during a rebase: if you are far behind you might need to apply multiple of them to your code base in sequence to bring code up to date.
The article points out an important need for gradual repair to be possible. Along with a way to express that transition backwards-compatibly for a period of time, an automated way to apply the refactoring steps could make adoption even easier. In this fashion, refactoring and improvements could have far lower costs for libraries, APIs, etc. than they do in today's languages.
> The refactoring could be a small but limited program of its own, that is evaluated against the program abstract syntax graph, and that can be as powerful as is warranted or needed to properly transform programs using the code. Moving or renaming code could be a relatively simple type of refactoring. However, if you've renamed f(int) to g(int, int) such that callers of f(N) should call g(N, 0), then a slightly more complex refactoring script could handle that too.
Go has such a tool already in the form of `gofmt -r`. You give it a syntactic rewrite rule like `a[b:len(a)] -> a[b:]` and it runs through the AST and rewrites the code for you. When Go was in development (pre-1.0) the developers used this mechanism quite often to make backwards-incompatible changes to the language.
> I have long wished that code changes to a repository could be accompanied by code refactorings that are intended to be applied to code using that repository
This was entirely feasible in Smalltalk. Not sure if people ever applied this outside the context of ORMs/database mapping. (Some shops did something like what you are describing in that limited context.) A short script that applied a specific refactoring could be placed in a specially named method on a particular class, for example. (Or some other mechanism could be used to store metadata on what versions the refactoring was applicable to.) It was also possible to quickly hack a GUI tool that you could bring up to selectively apply the refactorings.
Also, after the advent of "The Refactoring Browser," the parser engine was also used to build the refactorings directly into the "IDE." (It's actually where the parser engine came from.)
(Not only did this rewrite engine have full syntactic power, with all values wild-carded; one could also script against values in the parse tree or provide the values from snippets of code.)
I think they mean the "refactoring" should be shipped as part of the release, to be applied (possibly automatically or semi-automatically) by codebases depending on the library. It's not about a formal refactoring of the "current" codebase, which most modern IDEs can do and which is pretty easy in statically typed languages.
Though what they're talking about would generally be considered an API change, not a refactoring.
> I think they mean the "refactoring" should be shipped as part of the release to be applied by codebases depending on the library, possibly automatically or semi-automatically
That's exactly what I'm talking about! For example, in the StORE version control that comes with VisualWorks Smalltalk, it would be fairly easy to detect the presence of a particular class and method, whose name contains metadata about version, then pop up a window showing the potential refactorings.
The Go equivalent would be to have a dist-refactorings subdirectory and a go tool. Maybe dist-refactor?
    go dist-refactor ...
The refactoring scripts would have names that contain version information. Invoking dist-refactor would apply the list of available refactoring scripts to the codebase, from all of the dependencies that are at a version later than the one compiled against. Alternatively, one could invoke
As others pointed out, Go actually has some tooling for that, in the form of `gofmt -r` and `go fix`.
Still, it only works for refactorings simple enough to be applied automatically, which on the other hand are the simplest to maintain compatibility with (writing a stub with the old name/arguments that calls the new function).
This should extend beyond code and cover data, too. If you replace a bool field in a class with an enum, for instance, and then read in a file that serialized the bool values, they should automatically be upgraded to use the enum. (For systems with a REPL that page data in and out, the same should happen lazily on page-in; eagerly paging in every page to check it for to-be-migrated objects on every structural change to any class would make making changes too slow.)
There was/is tons of work on that, often using Common Lisp and the meta-object protocol, because of its excellent reflection capabilities and `change-class`.
> an automated way to apply the refactoring steps could make adoption even easier
It is called alias, a feature that should have been there a long time ago but "thanks" to a minority of gophers it's still not there. Go's fundamental design issue is that it conflates namespaces with URLs. It's not something that can be fixed by a third-party tool.
> Go's fundamental design issue is that it conflates namespaces with URLs.
No, there is nothing in the design of the Go language that conflates namespaces with URLs. Import paths are just strings. They are interpreted as directories under $GOPATH or ./vendor on the file system by the current Go compiler from golang.org.
One tool called 'go get', which helps to put Go source code files into these directories, does interpret them as URLs. But other implementations of the Go specification like gccgo don't even have the 'go' tool (and thus no 'go get').
Import paths don't even have to be interpreted as directory paths. They could be interpreted e.g. as database queries by other Go spec implementations.
I'm not a Go user myself, and my comment is about programming in general rather than Go specifically (though Go seems to be leading the way in this area).
Does alias help Go programs automatically transform themselves? Alias seems to provide direct compatibility between types, but doesn't seem to provide any way to automatically refactor code based on the alias, does it? https://github.com/golang/go/issues/16339
Basically what I'm thinking of is something like an alias feature where, when A has been renamed to B and the name A has been left behind as an alias to B for backwards compatibility, an automation layer could also help by automatically offering to rename A to B in existing code.
I think the reason it's not normally done is that for small open source projects, there are only a handful of usages and each author can easily do it manually. For a large monorepo, we do have (or are building) specialized search-and-replace tools to replace deprecated API usages with usages of the new API.
To scale this in the open source world, perhaps someone could search Github for Go projects and send out pull requests automatically. But people writing non-opensource code would still be pretty much on their own.
I'm imagining the refactoring rules as something that could perhaps ship alongside the library as part of its development release. Like, let's say we've released the latest version of our library which renames A to B. The release could include a refactoring rule that describes how to update from previous versions of the library to the latest version: a rule that identifies all references to A and replaces them with B. References that tooling cannot identify and fix can also still work for a time via the alias feature.
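A shipped rule file might be as small as this. The format is entirely hypothetical, in the spirit of `gofmt -r` rewrite rules, with a version range saying when the rules apply:

```
# refactorings/v7-to-v8.rules  (hypothetical format)
# applies when upgrading from >=7.0.0, <8.0.0
A -> B
f(x) -> g(x, 0)
```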
To make these rules easy to author, perhaps we could provide some form of assistance at the source control layer -- perhaps we could infer and propose refactorings based on the diffs that we see. If we see a diff renaming the method A to B, then we could see that and offer to construct a refactoring rule for all consuming code. Or with full IDE support, the IDE refactoring tool could call into the language service refactoring function to construct the rule.
To refactor in this language and platform, you'd rename your method, stage the commit, and then ask the language to analyze the change. It would detect the rename, offer to create a refactoring rule, and then you could accept that rule and apply it to your own code base (internal usage). You could have the option to save that refactoring rule as part of the package release. It could say: when consuming code upgrades from version 7 of this library to version 8 (or commit hash xxx to yyy), apply these refactorings. Indeed, even if the library does not ship refactoring rules -- or as an alternative to that model -- users could run the inference tool on the diff of changes to the library source. Shipping refactoring rules in the release would allow this scheme to work even if users don't have access to library source, though.
Imagine our library has hundreds of consumers on GitHub. When they next pick up a version of our library, and try to build the source, our build system could inform the user that refactorings are available and offer to apply them. The user clicks "accept", reviews the changes, and hopefully the package builds and its tests pass after that.
If we make this system really reliable, then we could apply these changes as the default course of action. Ideally the user would apply the refactorings and commit the changes to their source code, but even if the user doesn't, we could apply the changes to a temporary copy every time they build. Or the compiler could do it semantically for them, depending on how the rules work. This way even users who aren't willing to actively participate can still benefit from the capability. If your consuming package falls far behind the latest library, then the set of refactorings you need to apply to your code in order to build it could grow quite large and brittle, but it may still work (A changes to B today, and is moved into another package tomorrow). Like a rebase of many commits, you'd apply the rules one by one.
The goal would be to make it really easy to ship these refactoring rules as part of releases as a library vendor, and really easy to apply the rules as a consumer. The typical change would be small quality of life changes like renames and moves.
I wonder how a capability like this could transform the way we release software. Today, releasing a breaking change is anathema. It's anathema because we know it causes massive pain for users. If we had the ability to ease that pain, and make the upgrade process really easy or even automatic, then it could perhaps significantly change the way in which we think about interface contracts. Everyone who builds libraries is familiar with a time where you got the interface wrong, and really wish you could change it, but it's too late now because the library is too entrenched. This kind of system would make it possible for us to dig ourselves out of those problems and continuously improve even codebases that are widely used. Of course I have my head in the stars at this point, but I believe that all of this is plausible.
[I'm not a Go user, but these concepts have been something I've wanted to explore as features of a pet programming language I've been designing off-and-on.]
And yet interface compatibility/continuity would still be important unless you can automatically change all of the deployed instances without inducing any race conditions or transition issues. This can only be done through a backwards-compatible layer or a bifurcation of the install base. So really, nothing would be fundamentally different.
> And yet interface compatibility/continuity would still be important unless you can automatically change all of the deployed instances without inducing any race conditions or transition issues.
The VisualWorks Package system could theoretically be applied at runtime. It would create a shadow of the meta-level objects, then the new meta-level could be applied atomically with "shape changes" of instantiated objects happening all at once.
This 1) could still incur a pause in the program, and 2) in practice, bugs surfaced that prevented widespread use of the facility in running servers.