Search results for: sgd

Number of results: 1169

Journal: CoRR 2015
Thomas M. Breuel

Neural networks are usually trained by some form of stochastic gradient descent (SGD). A number of strategies intended to improve SGD optimization are in common use, such as learning rate schedules, momentum, and batching. These are motivated by ideas about the occurrence of local minima at different scales, valleys, and other phenomena in the objective function. Empirical results presented he...
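For concreteness, here is a minimal NumPy sketch of two of the strategies this abstract mentions, heavy-ball momentum and a step-decay learning-rate schedule. The quadratic toy objective and all hyperparameter values are illustrative assumptions, not the paper's setup.

```python
import numpy as np

def sgd_momentum(grad_fn, w, lr=0.1, mu=0.9, decay_every=100, decay=0.5, steps=500):
    """Plain SGD with heavy-ball momentum and a step-decay schedule."""
    v = np.zeros_like(w)
    for t in range(steps):
        if t > 0 and t % decay_every == 0:
            lr *= decay                  # learning-rate schedule: halve periodically
        g = grad_fn(w)                   # stochastic gradient estimate
        v = mu * v - lr * g              # momentum buffer
        w = w + v
    return w

rng = np.random.default_rng(0)
# Noisy gradient of f(w) = 0.5 * ||w||^2, standing in for a minibatch gradient.
noisy_grad = lambda w: w + 0.1 * rng.standard_normal(w.shape)
print(sgd_momentum(noisy_grad, rng.standard_normal(5)))
```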

2012
Bo Deng, Li-qun Jia, Huang-ying Tan, Xuan Yao, Fu-yun Gao, Lin Pan, Jian Cui, Qing Xiang

Bone metastasis (BM) is a major clinical problem for which current treatments lack full efficacy. The Traditional Chinese Medicine (TCM) Sangu Decoction (SGD) has been widely used to treat BM in China. However, no in vivo experiments to date have investigated the effects of TCM on osteoclast activity in BM. In this study, the protective effect and probable mechanism of SGD were evaluated. The m...

Journal: Journal of Machine Learning Research 2012
Zhuang Wang, Koby Crammer, Slobodan Vucetic

Online algorithms that process one example at a time are advantageous when dealing with very large data or with data streams. Stochastic Gradient Descent (SGD) is such an algorithm and it is an attractive choice for online Support Vector Machine (SVM) training due to its simplicity and effectiveness. When equipped with kernel functions, similarly to other SVM learning algorithms, SGD is suscept...
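A minimal sketch of the setting this abstract describes: kernelized SGD for SVM training (a Pegasos-style update), where every margin violation adds a support vector, plus a naive "remove the oldest support vector" budget rule. The budget strategy, RBF kernel, and hyperparameters are illustrative assumptions, not the paper's specific algorithm.

```python
import numpy as np

def rbf(x, z, gamma=1.0):
    return np.exp(-gamma * np.sum((x - z) ** 2))

def budgeted_kernel_sgd(stream, lam=0.01, budget=50):
    sv, alpha = [], []                       # support vectors and coefficients
    for t, (x, y) in enumerate(stream, start=1):
        eta = 1.0 / (lam * t)
        f = sum(a * rbf(z, x) for z, a in zip(sv, alpha))
        for i in range(len(alpha)):          # regularization shrinks coefficients
            alpha[i] *= (1 - eta * lam)
        if y * f < 1:                        # hinge-loss margin violation
            sv.append(x)
            alpha.append(eta * y)
            if len(sv) > budget:             # budget maintenance: drop oldest
                sv.pop(0); alpha.pop(0)
    return sv, alpha

rng = np.random.default_rng(0)
stream = [(rng.standard_normal(2), 1 if rng.random() < 0.5 else -1) for _ in range(200)]
sv, alpha = budgeted_kernel_sgd(stream)
print(len(sv))   # never exceeds the budget
```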

Journal: CoRR 2017
Nitish Shirish Keskar, Richard Socher

Despite superior training outcomes, adaptive optimization methods such as Adam, Adagrad, or RMSprop have been found to generalize poorly compared to stochastic gradient descent (SGD). These methods tend to perform well in the initial portion of training but are outperformed by SGD at later stages of training. We investigate a hybrid strategy that begins training with an adaptive method and switc...
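A minimal PyTorch sketch of the hybrid strategy described here: train with Adam for the first few epochs, then hand over to SGD with momentum. The fixed switch epoch and learning rates are illustrative assumptions; the paper's actual switching criterion is data-driven rather than a hardcoded epoch.

```python
import torch

def train(model, loader, loss_fn, epochs=30, switch_epoch=10):
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for epoch in range(epochs):
        if epoch == switch_epoch:
            # Switch to SGD mid-training; Adam's moment estimates are discarded.
            opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
        for x, y in loader:
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            opt.step()
    return model
```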

Journal: Journal of Machine Learning Research 2017
Stephan Mandt, Matthew D. Hoffman, David M. Blei

Stochastic Gradient Descent with a constant learning rate (constant SGD) simulates a Markov chain with a stationary distribution. With this perspective, we derive several new results. (1) We show that constant SGD can be used as an approximate Bayesian posterior inference algorithm. Specifically, we show how to adjust the tuning parameters of constant SGD to best match the stationary distributi...
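A minimal sketch of the idea: run SGD with a constant learning rate on a toy Gaussian model, discard a burn-in period, and treat the remaining iterates as draws from the stationary distribution. The model, learning rate, and burn-in length are illustrative assumptions; tuning these parameters so the stationary distribution matches the posterior is the paper's contribution and is not done here.

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.normal(loc=2.0, scale=1.0, size=1000)

def constant_sgd_samples(data, lr=0.05, batch=10, steps=5000, burn_in=1000):
    theta, trace = 0.0, []
    n = len(data)
    for _ in range(steps):
        idx = rng.integers(0, n, size=batch)
        # Minibatch gradient of the average negative log-likelihood of N(theta, 1).
        g = (theta - data[idx]).mean()
        theta -= lr * g                     # constant learning rate: no decay
        trace.append(theta)
    return np.array(trace[burn_in:])        # post-burn-in iterates as samples

samples = constant_sgd_samples(data)
print(samples.mean(), samples.std())        # approximate posterior mean and spread
```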

Journal: CoRR 2018
Cong Xie, Oluwasanmi Koyejo, Indranil Gupta

We propose three new robust aggregation rules for distributed synchronous Stochastic Gradient Descent (SGD) under a general Byzantine failure model. The attackers can arbitrarily manipulate the data transferred between the servers and the workers in the parameter server (PS) architecture. We prove the Byzantine resilience properties of these aggregation rules. Empirical analysis shows that the ...
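A minimal sketch of robust aggregation in this spirit: the parameter server combines worker gradients with a coordinate-wise median instead of a mean, so a minority of arbitrarily corrupted gradients cannot drag the update away. The paper proposes three specific rules; the median here is only an illustrative stand-in, and the worker counts are assumptions.

```python
import numpy as np

def robust_aggregate(worker_grads):
    """worker_grads: array of shape (num_workers, dim)."""
    return np.median(worker_grads, axis=0)  # coordinate-wise median

rng = np.random.default_rng(2)
honest = rng.normal(1.0, 0.1, size=(8, 4))   # 8 honest workers near the true gradient
byzantine = np.full((3, 4), 1e6)             # 3 attackers send arbitrary values
grads = np.vstack([honest, byzantine])
print(robust_aggregate(grads))               # stays near the honest gradients
print(grads.mean(axis=0))                    # the plain mean is destroyed
```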

Journal: CoRR 2016
Jianmin Chen, Rajat Monga, Samy Bengio, Rafal Józefowicz

Distributed training of deep learning models on large-scale training data is typically conducted with asynchronous stochastic optimization to maximize the rate of updates, at the cost of additional noise introduced from asynchrony. In contrast, the synchronous approach is often thought to be impractical due to idle time wasted on waiting for straggling workers. We revisit these conventional bel...
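A minimal sketch of the synchronous-with-backups idea this abstract revisits: launch more workers than needed, aggregate the first k gradients to arrive, and drop the stragglers, trading a little redundant compute for no idle waiting. The thread-pool simulation and timings are illustrative assumptions about a real parameter-server setup.

```python
import random
import time
import numpy as np
from concurrent.futures import ThreadPoolExecutor, as_completed

def worker_grad(dim):
    time.sleep(random.uniform(0.01, 0.2))    # simulated variable worker speed
    return np.random.randn(dim)

def sync_step(num_workers=10, needed=8, dim=4):
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        futures = [pool.submit(worker_grad, dim) for _ in range(num_workers)]
        grads = []
        for f in as_completed(futures):      # take the first `needed` to finish
            grads.append(f.result())
            if len(grads) == needed:
                break                        # stragglers' results are ignored
    return np.mean(grads, axis=0)            # synchronous aggregation

print(sync_step())
```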

Journal: CoRR 2013
Shenghuo Zhu

With a weighting scheme proportional to t, a traditional stochastic gradient descent (SGD) algorithm achieves a high-probability convergence rate of O(κ/T) for strongly convex functions, instead of O(κ ln(T)/T). We also prove that an accelerated SGD algorithm achieves the same O(κ/T) rate.
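A minimal sketch of the weighting scheme: run SGD on a strongly convex objective and return the average of the iterates weighted proportionally to t, which is what removes the ln(T) factor from the rate. The 1/(μt)-style step sizes and the quadratic toy objective are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

def t_weighted_sgd(grad_fn, w, mu=1.0, T=1000):
    """SGD on a mu-strongly convex objective, returning the t-weighted average."""
    weighted_sum, total_weight = np.zeros_like(w), 0.0
    for t in range(1, T + 1):
        w = w - (2.0 / (mu * (t + 1))) * grad_fn(w)  # step size ~ 1/(mu t)
        weighted_sum += t * w                         # weight proportional to t
        total_weight += t
    return weighted_sum / total_weight

# Noisy gradient of the strongly convex toy objective f(w) = 0.5 * ||w||^2.
noisy_grad = lambda w: w + 0.1 * rng.standard_normal(w.shape)
print(t_weighted_sgd(noisy_grad, rng.standard_normal(4)))
```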

Journal: CoRR 2018
Weijie Su, Yuancheng Zhu

Stochastic gradient descent (SGD) is an immensely popular approach for online learningin settings where data arrives in a stream or data sizes are very large. However, despite anever-increasing volume of work on SGD, much less is known about the statistical inferentialproperties of SGD-based predictions. Taking a fully inferential viewpoint, this paper introducesa novel proc...
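To illustrate the kind of question the abstract raises, here is one simple, generic way to attach uncertainty to an SGD-based prediction: run several independent SGD fits on bootstrap resamples and form a t-interval from the resulting predictions. This is a stand-in for illustration only, not the specific inferential procedure the paper introduces; the linear model and all settings are assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n, d = 500, 3
X = rng.standard_normal((n, d))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.5 * rng.standard_normal(n)

def sgd_linreg(X, y, lr=0.01, epochs=5):
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i in rng.permutation(len(y)):
            w -= lr * (X[i] @ w - y[i]) * X[i]   # per-example squared-loss gradient
    return w

x_new, runs = np.ones(d), 10
preds = []
for _ in range(runs):
    idx = rng.integers(0, n, size=n)             # bootstrap resample of the data
    preds.append(x_new @ sgd_linreg(X[idx], y[idx]))
preds = np.array(preds)
half = stats.t.ppf(0.975, runs - 1) * preds.std(ddof=1) / np.sqrt(runs)
print(preds.mean(), "+/-", half)                 # rough 95% interval for the prediction
```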

2014
Steffen Porwollik, Carlos A. Santiviago, Pui Cheng, Fred Long, Prerak Desai, Jennifer Fredlund, Shabarinath Srikumar, Cecilia A. Silva, Weiping Chu, Xin Chen, Rocío Canals, M. Megan Reynolds, Lydia Bogomolnaya, Christine Shields, Ping Cui, Jinbai Guo, Yi Zheng, Tiana Endicott-Yazdani, Hee-Jeong Yang, Aimee Maple, Yury Ragoza, Carlos J. Blondel, Camila Valenzuela, Helene Andrews-Polymenis, Michael McClelland

We constructed two collections of targeted single gene deletion (SGD) mutants and two collections of targeted multi-gene deletion (MGD) mutants in Salmonella enterica sv Typhimurium 14028s. The SGD mutant collections contain: (1) 3,517 mutants in which a single gene is replaced by a cassette containing a kanamycin resistance (KanR) gene oriented in the sense direction (SGD-K), and (2) 3,376 muta...

Chart: number of search results per year