Distributed statistical learning problems arise commonly when dealing with large datasets. In this setup, datasets are partitioned over machines, which compute locally, and communicate short messages. Communication is often the bottleneck. In this paper, we study one-step iterative weighted parameter averaging in linear models under data parallelism. We do regression on each machine, send results to a ...