full_info,tags
"intrusion detection systems using adaptive regression splines. the past few years have witnessed a growing recognition of intelligent techniques for the construction of efficient and reliable intrusion detection systems. due to increasing incidents of cyber attacks, building effective intrusion detection systems (ids) is essential for protecting information systems security, yet it remains an elusive goal and a great challenge. in this paper, we report a performance analysis of multivariate adaptive regression splines (mars), neural networks and support vector machines. the mars procedure builds flexible regression models by fitting separate splines to distinct intervals of the predictor variables. a brief comparison of different neural network learning algorithms is also given.",4 "deep value networks learn to evaluate and iteratively refine structured outputs. we approach structured output prediction by optimizing a deep value network (dvn) to precisely estimate the task loss on different output configurations for a given input. once the model is trained, we perform inference by gradient descent on continuous relaxations of the output variables to find outputs with promising scores from the value network. when applied to image segmentation, the value network takes an image and a segmentation mask as inputs and predicts a scalar estimating the intersection over union between the input and ground truth masks. for multi-label classification, the dvn's objective is to correctly predict the f1 score of any potential label configuration. the dvn framework achieves state-of-the-art results on multi-label prediction and image segmentation benchmarks.",4 "financial portfolio optimization: computationally guided agents to investigate, analyse and invest!?. financial portfolio optimization is a widely studied problem in the mathematics, statistics, and financial and computational literature. it adheres to determining an optimal combination of weights associated with the financial assets held in a portfolio. in practice, it faces challenges by virtue of varying math. formulations, parameters, business constraints and complex financial instruments.
the empirical nature of the data is no longer one-sided; it reflects upside and downside trends with repeated yet unidentifiable cyclic behaviours, potentially caused by high frequency volatile movements in asset trades. portfolio optimization under such circumstances is theoretically and computationally challenging. this work presents a novel mechanism to reach an optimal solution by encoding a variety of optimal solutions in a solution bank to guide the search process for the global investment objective formulation. it conceptualizes the role of individual solver agents that contribute optimal solutions to a bank of solutions, and a super-agent solver that learns from the solution bank; it thus reflects a knowledge-based, computationally guided agents approach to investigate, analyse and reach an optimal solution for informed investment decisions. a conceptual understanding of the classes of solver agents that represent varying problem formulations, and of mathematically oriented deterministic solvers along with stochastic-search driven evolutionary and swarm-intelligence based techniques for finding optimal weights, is discussed. an algorithmic implementation is presented through an enhanced neighbourhood generation mechanism in the simulated annealing algorithm. a framework for the inclusion of heuristic knowledge and human expertise from the financial literature related to the investment decision making process is reflected via the introduction of controlled perturbation strategies using a decision matrix for neighbourhood generation.",17 "adaptive visualisation system for construction building information models using saliency. building information modeling (bim) is a recent construction process based on a 3d model containing every component related to the building achievement. architects, structure engineers, method engineers, and other participants in the building process work on this model throughout the design-to-construction cycle. the high complexity and the large amount of information included in these models raise several issues, delaying their wide adoption in the industrial world. one of the most important is visualization: professionals have difficulties finding the information relevant to their job.
current solutions suffer from two limitations: the information in bim models is processed manually, and insignificant information is simply hidden, leading to inconsistencies in the building model. this paper describes a system relying on an ontological representation of the building information to automatically label building elements. depending on the user's department, the visualization is modified according to these labels by automatically adjusting colors and image properties based on a saliency model. the proposed saliency model incorporates several adaptations to fit the specificities of architectural images.",4 "making sensitivity analysis computationally efficient. to investigate the robustness of the output probabilities of a bayesian network, a sensitivity analysis can be performed. a one-way sensitivity analysis establishes, for each of the probability parameters of the network, a function expressing a posterior marginal probability of interest in terms of that parameter. current methods for computing the coefficients in such a function rely on a large number of network evaluations. in this paper, we present a method that requires just a single outward propagation in a junction tree for establishing the coefficients of the functions for all possible parameters; in addition, an inward propagation is required for processing evidence. conversely, the method requires only a single outward propagation for computing the coefficients of the functions expressing all possible posterior marginals in terms of a single parameter. we extend these results to n-way sensitivity analysis, in which sets of parameters are studied.",4 "an explanation mechanism for bayesian inferencing systems. explanation facilities are a particularly important feature of expert system frameworks. it is an area in which traditional rule-based expert system frameworks have had mixed results. while explanations about control are well handled, facilities are needed for generating better explanations concerning knowledge base content. this paper approaches the explanation problem by examining the effect of an event or variable of interest within a symmetric bayesian inferencing system. we argue that any effect measure operating in this context must satisfy certain properties. such a measure is proposed. it forms the basis for an explanation facility that allows the user of a generalized bayesian inferencing system to question the meaning of the knowledge base.
this facility is described in detail.",4 "foodnet: recognizing foods using an ensemble of deep networks. in this work we propose a methodology for an automatic food classification system that recognizes the contents of a meal from images of the food. we developed a multi-layered deep convolutional neural network (cnn) architecture that takes advantage of features from other deep networks and improves efficiency. numerous classical handcrafted features and approaches are explored, among which cnns are chosen as the best performing features. the networks are trained and fine-tuned using preprocessed images, and the filter outputs are fused to achieve higher accuracy. experimental results on the largest real-world food recognition database, eth food-101, and a newly contributed indian food image database demonstrate the effectiveness of the proposed methodology compared to many benchmark deep learned cnn frameworks.",4 "natural language does not emerge 'naturally' in multi-agent dialog. a number of recent works have proposed techniques for end-to-end learning of communication protocols among cooperative multi-agent populations, and have simultaneously found the emergence of grounded, human-interpretable language in the protocols developed by the agents, all learned without any human supervision! in this paper, using a task-and-tell reference game between two agents as a testbed, we present a sequence of 'negative' results culminating in a 'positive' one -- showing that while most agent-invented languages are effective (i.e. achieve near-perfect task rewards), they are decidedly not interpretable or compositional. in essence, we find that natural language does not emerge 'naturally', despite the semblance of ease of natural-language-emergence that one may gather from recent literature. we discuss how it is possible to coax these invented languages to become more human-like and compositional by increasing the restrictions on how the two agents may communicate.",4 "landmark-guided elastic shape analysis of human character motions. motions of virtual characters in movies and video games are typically generated by recording actors using motion capturing methods. animations generated this way often need postprocessing, such as improving the periodicity of cyclic animations or generating entirely new motions by interpolation of existing ones.
furthermore, search and classification of recorded motions become increasingly important as the amount of recorded motion data grows. in this paper, we apply methods from shape analysis to the processing of animations. more precisely, we use the classical elastic metric model used for shape matching, and extend it by incorporating additional inexact feature point information, which leads to improved temporal alignment of different animations.",4 "long-term multi-granularity deep framework for driver drowsiness detection. in real-world driver drowsiness detection from videos, the variation of head pose is so large that existing methods based on the global face are not capable of extracting effective features, e.g. when looking aside or lowering the head. temporal dependencies with variable length are also rarely considered by previous approaches, e.g., yawning and speaking. in this paper, we propose a long-term multi-granularity deep framework to detect driver drowsiness in driving videos containing frontal faces. the framework includes two key components: (1) a multi-granularity convolutional neural network (mcnn), a novel network that utilizes a group of parallel cnn extractors on well-aligned facial patches of different granularities, extracts facial representations effectively under large variation of head pose, and furthermore can flexibly fuse both detailed appearance clues of the main parts and local-to-global spatial constraints; (2) a deep long short term memory network applied on the facial representations to explore long-term relationships with variable length over sequential frames, which is capable of distinguishing states with temporal dependencies, such as blinking and closing eyes. our approach achieves 90.05% accuracy at about 37 fps on the evaluation set of the public nthu-ddd dataset, the state-of-the-art for driver drowsiness detection. moreover, we build a new dataset named fi-ddd, which has higher precision of drowsy locations in the temporal dimension.",4 "consistent query answering via asp from different perspectives: theory and practice. a data integration system provides transparent access to different data sources by suitably combining their data, providing the user with a unified view of them, called the global schema.
however, source data are generally not under the control of the data integration process, and thus the integrated data may violate global integrity constraints even in the presence of locally-consistent data sources. in this scenario, it may anyway be interesting to retrieve as much consistent information as possible. the process of answering user queries under global constraint violations is called consistent query answering (cqa). several notions of cqa have been proposed, e.g., depending on whether integrated information is assumed to be sound, complete, exact or a variant of them. this paper provides a contribution in this setting: it unifies solutions coming from different perspectives under a common asp-based core, and provides query-driven optimizations designed for isolating and eliminating inefficiencies of the general approach for computing consistent answers. moreover, the paper introduces new theoretical results enriching the existing knowledge on decidability and complexity of the considered problems. the effectiveness of the approach is evidenced by experimental results. to appear in theory and practice of logic programming (tplp).",4 "using simulated annealing to calculate the trembles of trembling hand perfection. within the literature on non-cooperative game theory, there have been a number of attempts to propose algorithms to compute nash equilibria. rather than derive a new algorithm, this paper shows that the family of algorithms known as markov chain monte carlo (mcmc) can be used to calculate nash equilibria. mcmc is a type of monte carlo simulation that relies on markov chains to ensure its regularity conditions. mcmc has been widely used throughout the statistics and optimization literature, where variants of the algorithm are known as simulated annealing. this paper shows that there is an interesting connection between the trembles that underlie the functioning of this algorithm and the type of nash refinement known as trembling hand perfection.",4 "neuroevolution on the edge of chaos. echo state networks represent a special type of recurrent neural networks. recent papers have stated that echo state networks maximize their computational performance on the transition between order and chaos, the so-called edge of chaos. this work confirms this statement in a comprehensive set of experiments. furthermore, echo state networks are compared to networks evolved via neuroevolution.
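the echo-state-network dynamics discussed in this abstract can be sketched in a few lines of numpy; this is a minimal illustration under assumed parameter values (reservoir size, density, leak rate are not from the paper), with the spectral radius scaled close to 1, i.e. toward the edge of chaos:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_reservoir(n=200, density=0.1, spectral_radius=0.95):
    """Random sparse reservoir, rescaled so its largest |eigenvalue|
    equals spectral_radius (values near 1 sit close to the edge of chaos)."""
    W = rng.standard_normal((n, n)) * (rng.random((n, n)) < density)
    W *= spectral_radius / max(abs(np.linalg.eigvals(W)))
    return W

def run_esn(W, W_in, inputs, leak=1.0):
    """Drive the reservoir with an input sequence; collect the states."""
    x = np.zeros(W.shape[0])
    states = []
    for u in inputs:
        x = (1 - leak) * x + leak * np.tanh(W @ x + W_in @ np.atleast_1d(u))
        states.append(x.copy())
    return np.array(states)

W = make_reservoir()
W_in = rng.standard_normal((200, 1))
states = run_esn(W, W_in, np.sin(np.linspace(0, 8 * np.pi, 100)))
# a linear readout (e.g. ridge regression on `states`) is trained separately
```

only the fixed random reservoir and readout training are standard; the comparison with evolved topologies in the abstract replaces this random wiring with weights found by neuroevolution.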
the evolved networks outperform echo state networks; however, the evolution consumes significant computational resources. it is demonstrated that echo state networks with local connections combine the best of both worlds: the simplicity of random echo state networks and the performance of evolved networks. finally, it is shown that the evolution tends to stay close to the ordered side of the edge of chaos.",4 "general factorization framework for context-aware recommendations. context-aware recommendation algorithms focus on refining recommendations by considering additional information available to the system. this topic has gained a lot of attention recently. among others, several factorization methods were proposed to solve the problem, although most assume explicit feedback, which strongly limits their real-world applicability. while these algorithms apply various loss functions and optimization strategies, preference modeling under context is less explored due to the lack of tools allowing easy experimentation with various models. as context dimensions are introduced beyond users and items, the space of possible preference models and the importance of proper modeling largely increase. in this paper we propose a general factorization framework (gff), a single flexible algorithm that takes the preference model as an input and computes latent feature matrices for the input dimensions. gff allows us to easily experiment with various linear models on any context-aware recommendation task, be it explicit or implicit feedback based. its scaling properties make it usable under real life circumstances as well. we demonstrate the framework's potential by exploring various preference models on a 4-dimensional context-aware problem, with contexts that are available for almost all real life datasets. we show in experiments -- performed on five real life, implicit feedback datasets -- that proper preference modelling significantly increases recommendation accuracy, and previously unused models outperform the traditional ones. novel models in gff also outperform state-of-the-art factorization algorithms. we also extend the method to be fully compliant with the multidimensional dataspace model, one of the most extensive data models for context-enriched data.
the extended gff allows seamless incorporation of information fac[truncated]",4 "prediction with restricted resources and finite automata. we obtain an index of the complexity of a random sequence by allowing the role of the measure in classical probability theory to be played by a function we call the generating mechanism. typically, the generating mechanism is a finite automaton. we generate a set of biased sequences by applying a finite state automaton with a specified number, $m$, of states to the set of binary sequences. thus we index the complexity of our random sequence by the number of states of the automaton. we detail optimal algorithms to predict sequences generated in this way.",19 "deep neural networks under stress. in recent years, deep architectures have been used for transfer learning with state-of-the-art performance on many datasets. the properties of their features remain, however, largely unstudied from the transfer perspective. in this work, we present an extensive analysis of the resiliency of feature vectors extracted from deep models, with a special focus on the trade-off between performance and compression rate. by introducing perturbations to image descriptions extracted from a deep convolutional neural network, we change their precision and number of dimensions, measuring how this affects the final score. we show that deep features are more robust to these disturbances when compared to classical approaches, achieving a compression rate of 98.4% while losing only 0.88% of the original score on pascal voc 2007.",4 "making neural qa as simple as possible but not simpler. the recent development of large-scale question answering (qa) datasets triggered a substantial amount of research into end-to-end neural architectures for qa. increasingly complex systems have been conceived without comparison to simpler neural baseline systems that would justify their complexity. in this work, we propose a simple heuristic that guides the development of neural baseline systems for the extractive qa task. we find that two ingredients are necessary for building a high-performing neural qa system: first, awareness of question words while processing the context, and second, a composition function that goes beyond simple bag-of-words modeling, such as recurrent neural networks. our results show that fastqa, a system that meets these two requirements, can achieve very competitive performance compared with existing models.
we argue that this surprising finding puts the results of previous systems and the complexity of recent qa datasets into perspective.",4 "translation-based constraint answer set solving. we solve constraint satisfaction problems through translation to answer set programming (asp). our reformulations have the property that unit-propagation in the asp solver achieves well defined local consistency properties like arc, bound and range consistency. experiments demonstrate the computational value of this approach.",4 "api design for machine learning software: experiences from the scikit-learn project. scikit-learn is an increasingly popular machine learning library. written in python, it is designed to be simple and efficient, accessible to non-experts, and reusable in various contexts. in this paper, we present and discuss our design choices for the application programming interface (api) of the project. in particular, we describe the simple and elegant interface shared by all learning and processing units in the library, and discuss its advantages in terms of composition and reusability. the paper also comments on implementation details specific to the python ecosystem and analyzes obstacles faced by users and developers of the library.",4 "metalearning for feature selection. a general formulation of optimization problems in which various candidate solutions may use different feature-sets is presented, encompassing supervised classification, automated program learning and other cases. a novel characterization of the concept of a ""good quality feature"" for such an optimization problem is provided; and a proposal is suggested regarding the integration of quality-based feature selection into metalearning, wherein the quality of a feature for a problem is estimated using knowledge about related features in the context of related problems. results are presented regarding extensive testing of this ""feature metalearning"" approach on supervised text classification problems; it is demonstrated that, in this context, feature metalearning can provide significant and sometimes dramatic speedup over standard feature selection heuristics.",4 "nonlinear supervised dimensionality reduction via smooth regular embeddings. the recovery of the intrinsic geometric structures of data collections is an important problem in data analysis.
supervised extensions of several manifold learning approaches have been proposed in recent years. however, existing methods primarily focus on the embedding of the training data, and the generalization of the embedding to initially unseen test data is rather ignored. in this work, we build on recent theoretical results on the generalization performance of supervised manifold learning algorithms. motivated by these performance bounds, we propose a supervised manifold learning method that computes a nonlinear embedding while constructing a smooth and regular interpolation function that extends the embedding to the whole data space in order to achieve satisfactory generalization. the embedding and the interpolator are jointly learnt such that the lipschitz regularity of the interpolator is imposed while ensuring the separation between different classes. experimental results on several image data sets show that the proposed method yields quite satisfactory performance in comparison with other supervised dimensionality reduction algorithms and traditional classifiers.",4 "regularization methods for learning incomplete matrices. we use convex relaxation techniques to provide a sequence of solutions to the matrix completion problem. using the nuclear norm as a regularizer, we provide simple and efficient algorithms for minimizing the reconstruction error subject to a bound on the nuclear norm. our algorithm iteratively replaces the missing elements with those obtained from a thresholded svd. with warm starts, this allows us to efficiently compute an entire regularization path of solutions.",19 "multimodal latent variable analysis. we consider a set of multiple, multimodal sensors capturing a complex system or a physical phenomenon of interest. our primary goal is to distinguish the underlying sources of variability manifested in the measured data. the first step in our analysis is to find the common source of variability present in all sensor measurements. we base our work on a recent paper, which tackles this problem with alternating diffusion (ad). in this work, we suggest an analysis extracting the sensor-specific variables in addition to the common source. we propose an algorithm, analyze it theoretically, and demonstrate it on three different applications: a synthetic example, a toy problem, and the task of fetal ecg extraction.",4 "topic supervised non-negative matrix factorization.
topic models have been extensively used to organize and interpret the contents of large, unstructured corpora of text documents. although topic models often perform well on traditional training vs. test set evaluations, it is often the case that the results of a topic model do not align with human interpretation. this interpretability fallacy is largely due to the unsupervised nature of topic models, which prohibits any user guidance on the results of the model. in this paper, we introduce a semi-supervised method called topic supervised non-negative matrix factorization (ts-nmf) that enables the user to provide labeled example documents to promote the discovery of more meaningful semantic structure in a corpus. in this way, the results of ts-nmf better match the intuition and desired labeling of the user. the core of ts-nmf relies on solving a non-convex optimization problem, for which we derive an iterative algorithm that is shown to be monotonic and convergent to a local optimum. we demonstrate the practical utility of ts-nmf on the reuters and pubmed corpora, and find ts-nmf especially useful for conceptual or broad topics, where topic key terms are not well understood. although identifying an optimal latent structure for the data is not the primary objective of the proposed approach, we find that ts-nmf achieves higher weighted jaccard similarity scores than contemporary methods, (unsupervised) nmf and latent dirichlet allocation, at supervision rates as low as 10% to 20%.",4 "bayesian online changepoint detection. changepoints are abrupt variations in the generative parameters of a data sequence. online detection of changepoints is useful in the modelling and prediction of time series in application areas such as finance, biometrics, and robotics. while frequentist methods have yielded online filtering and prediction techniques, most bayesian papers have focused on the retrospective segmentation problem. here we examine the case where the model parameters before and after the changepoint are independent, and we derive an online algorithm for exact inference of the most recent changepoint. we compute the probability distribution of the length of the current ``run,'' or time since the last changepoint, using a simple message-passing algorithm. our implementation is highly modular so that the algorithm may be applied to a variety of types of data.
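the run-length recursion this abstract describes can be sketched for a Gaussian observation model; this is a minimal illustrative variant with assumed parameter names and priors (known variance, hazard rate 1/100), not the authors' implementation:

```python
import numpy as np
from scipy.stats import norm

def bocd_gaussian(data, hazard=1 / 100, mu0=0.0, sigma=1.0):
    """Minimal Bayesian online changepoint detection for known-variance
    Gaussian data. Returns the run-length posterior R[t] at each step."""
    T = len(data)
    R = np.zeros((T + 1, T + 1))
    R[0, 0] = 1.0                       # run length 0 before any data
    mean, count = np.array([mu0]), np.array([1.0])
    for t, x in enumerate(data):
        # predictive probability of x under each possible run length
        pred = norm.pdf(x, mean, sigma * np.sqrt(1 + 1 / count))
        growth = R[t, : t + 1] * pred * (1 - hazard)   # run continues
        cp = np.sum(R[t, : t + 1] * pred * hazard)     # changepoint now
        R[t + 1, 1 : t + 2] = growth
        R[t + 1, 0] = cp
        R[t + 1] /= R[t + 1].sum()                     # normalize message
        # conjugate update of the sufficient statistics per run length
        mean = np.append(mu0, (mean * count + x) / (count + 1))
        count = np.append(1.0, count + 1)
    return R
```

each row `R[t]` is the posterior over the time since the last changepoint after observing `t` points; the modularity noted in the abstract comes from swapping the Gaussian predictive for any other conjugate model.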
we illustrate this modularity by demonstrating the algorithm on three different real-world data sets.",19 "a survey of estimation of mutual information methods as a measure of dependency versus correlation analysis. in this survey, we present and compare different approaches to estimating mutual information (mi) from data, in order to analyse general dependencies between variables of interest in a system. we demonstrate the performance difference of mi versus correlation analysis, which is optimal only in the case of linear dependencies. first, we use a piece-wise constant bayesian methodology with a general dirichlet prior. in this estimation method, we use a two-stage approach: approximate the probability distribution first, then calculate the marginal and joint entropies. we demonstrate the performance of this bayesian approach versus others for computing the dependency between different variables, and also compare it with linear correlation analysis. finally, we apply mi and correlation analysis to the identification of the bias in the determination of the aerosol optical depth (aod) by the satellite based moderate resolution imaging spectroradiometer (modis) and the ground based aerosol robotic network (aeronet). here, we observe that the aod measurements by these two instruments might differ at the same location. the reason for this bias is explored by quantifying the dependencies of the bias with 15 variables, including cloud cover, surface reflectivity and others.",19 "regional active contours based on variational level sets and machine learning for image segmentation. image segmentation is the problem of partitioning an image into different subsets, where each subset may have a different characterization in terms of color, intensity, texture, and/or other features. segmentation is a fundamental component of image processing, and plays a significant role in computer vision, object recognition, and object tracking. active contour models (acms) constitute a powerful energy-based minimization framework for image segmentation, which relies on the concept of contour evolution. starting from an initial guess, the contour is evolved with the aim of approximating better and better the actual object boundary.
handling complex images in an efficient, effective, and robust way is a real challenge, especially in the presence of intensity inhomogeneity, overlap between foreground/background intensity distributions, objects characterized by many different intensities, and/or additive noise. in this thesis, to deal with these challenges, we propose a number of image segmentation models relying on variational level set methods and specific kinds of neural networks, to handle complex images in both supervised and unsupervised ways. experimental results demonstrate the high accuracy of the segmentation results obtained by the proposed models on various benchmark synthetic and real images, compared with state-of-the-art active contour models.",4 "the promise and peril of human evaluation for model interpretability. transparency, user trust, and human comprehension are popular ethical motivations for interpretable machine learning. in support of these goals, researchers evaluate model explanation performance using humans and real world applications. this alone presents a challenge in many areas of artificial intelligence. in this position paper, we propose a distinction between descriptive and persuasive explanations. we discuss reasoning suggesting that functional interpretability may be correlated with cognitive function and user preferences. if this is indeed the case, evaluation and optimization using functional metrics could perpetuate implicit cognitive bias in explanations that threatens transparency. finally, we propose two potential research directions to disambiguate cognitive function and explanation models, while retaining control over the tradeoff between accuracy and interpretability.",4 "assisting composition of email responses: a topic prediction approach. we propose an approach for helping agents compose email replies to customer requests. to enable that, we use lda to extract latent topics from a collection of email exchanges. we then use these latent topics to label our data, obtaining a so-called ""silver standard"" topic labelling. we exploit this labelled set to train a classifier to: (i) predict the topic distribution of the entire agent's email response, based on features of the customer's email; and (ii) predict the topic distribution of the next sentence in the agent's reply, based on the customer's email features and features of the agent's current sentence.
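the "silver standard" labelling step just described can be sketched with scikit-learn's LDA implementation; the toy corpus, topic count, and variable names below are illustrative assumptions, not the authors' setup:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# hypothetical stand-in for the contact-center email collection
emails = [
    "my internet connection keeps dropping every evening",
    "i was billed twice for the same month please refund",
    "how do i upgrade my data plan to unlimited",
    "the router light is red and there is no signal",
]

vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(emails)

# the paper uses fifty topics; two suffice for this toy corpus
lda = LatentDirichletAllocation(n_components=2, random_state=0)
theta = lda.fit_transform(X)        # per-email topic distributions

# silver-standard label = most probable latent topic of each email
silver_labels = theta.argmax(axis=1)
```

these automatically obtained labels then play the role of supervision for the classifier that predicts the topic of the agent's next sentence.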
experimental results on a large email collection from a contact center in the telecom domain show that the proposed approach is effective in predicting the best topic of the agent's next sentence. in 80% of the cases, the correct topic is present among the top five recommended topics (out of fifty possible ones). this shows the potential of the method to be applied in an interactive setting, where the agent is presented with a small list of likely topics to choose from for the next sentence.",4 "high-order graph convolutional recurrent neural network: a deep learning framework for network-scale traffic learning and forecasting. traffic forecasting is a challenging task, due to the complicated spatial dependencies on roadway networks and the time-varying traffic patterns. to address this challenge, we learn the traffic network as a graph and propose a novel deep learning framework, the high-order graph convolutional long short-term memory neural network (hgc-lstm), to learn the interactions between links in the traffic network and forecast the network-wide traffic state. we define the high-order traffic graph convolution based on the physical network topology. the proposed framework employs l1-norms on the graph convolution weights and l2-norms on the graph convolution features to identify the most influential links in the traffic network. we propose a novel real-time branching learning (rtbl) algorithm for the hgc-lstm framework to accelerate the training process for spatio-temporal data. experiments show that our hgc-lstm network is able to capture the complex spatio-temporal dependencies efficiently present in the traffic network, and consistently outperforms state-of-the-art baseline methods on two heterogeneous real-world traffic datasets. the visualization of the graph convolution weights shows that the proposed framework can accurately recognize the most influential roadway segments in real-world traffic networks.",4 "a deep q-learning agent for the l-game with variable batch training. we employ the deep q-learning algorithm with experience replay to train an agent capable of achieving a high level of play in the l-game, while self-learning from low-dimensional states. we also employ variable batch size for training in order to mitigate the loss of the rare reward signal and significantly accelerate training.
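experience replay with a variable batch size, as used above, can be sketched as follows; the growth schedule and all names here are illustrative assumptions, not the paper's actual rule:

```python
import random
from collections import deque

class ReplayBuffer:
    """Minimal experience replay with a variable batch size: small batches
    early on keep rare rewarded transitions from being drowned out, and the
    batch grows as training proceeds (doubling schedule is an assumption)."""

    def __init__(self, capacity=10_000):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, step, base=8, cap=64, grow_every=1000):
        # start at `base`, double every `grow_every` steps, never exceed
        # `cap` or the number of stored transitions
        batch = min(cap, base * 2 ** (step // grow_every), len(self.buffer))
        return random.sample(self.buffer, batch)

buf = ReplayBuffer()
for i in range(100):
    buf.push(i, 0, 0.0, i + 1, False)
early = buf.sample(step=0)       # small batch while rewards are rare
late = buf.sample(step=5000)     # larger batch later in training
```

the sampled transitions would then feed the usual dqn temporal-difference update.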
despite the large action space due to the number of possible moves, the low-dimensional state space, and the rarity of rewards, which come only at the end of the game, dql is successful in training an agent capable of strong play without the use of any search methods or domain knowledge.",4 "modelling legal contracts as processes. this paper concentrates on the representation of the legal relations that obtain between parties who have entered into a contractual agreement, and on the evolution of the agreement as it progresses in time. contracts are regarded as processes and are analysed in terms of the obligations that are active at various points in their life span. an informal notation is introduced that conveniently summarizes the states of an agreement as it evolves in time. the representation enables us to determine what the status of an agreement is, given an event or a sequence of events that concern the performance of actions by the agents involved. this is useful in the context of contract drafting (where parties might wish to preview how their agreement might evolve) and in the context of contract performance monitoring (where parties might wish to establish their legal positions under an agreement in force). the discussion is based on an example that illustrates typical patterns of contractual obligations.",4 "towards automation of the data quality system for the cern cms experiment. daily operation of a large-scale experiment is a challenging task, particularly from the perspective of routine monitoring of the quality of the data being taken. we describe an approach that uses machine learning for an automated system to monitor data quality, based on partial use of data qualified manually by detector experts. the system automatically classifies marginal cases, both good and bad data, and uses human expert decisions to classify the remaining ""grey area"" cases. this study uses collision data collected by the cms experiment at the lhc in 2010. we demonstrate that the proposed workflow is able to automatically process at least 20% of samples without noticeable degradation of the result.",15 "the multi-vehicle covering tour problem: building routes for urban patrolling. in this paper we study a particular aspect of urban community policing: routine patrol route planning. we seek routes that guarantee visibility, as this has a sizable impact on the community's perceived safety, allowing quick emergency responses and providing surveillance of selected sites (e.g., hospitals, schools).
the planning is restricted by the availability of vehicles and strives to achieve balanced routes. we study an adaptation of the model for the multi-vehicle covering tour problem, in which a set of locations must be visited, whereas another subset must be close enough to the planned routes. it constitutes an np-complete integer programming problem. suboptimal solutions are obtained with several heuristics, some adapted from the literature and others developed by us. we solve adapted instances from tsplib and an instance with real data; the former are compared with results from the literature, the latter with empirical data.",4 "learning depth from monocular videos using direct methods. the ability to predict depth from a single image - using recent advances in cnns - is of increasing interest to the vision community. unsupervised strategies for learning are particularly appealing, as they can utilize much larger and more varied monocular video datasets during learning without the need for ground truth depth or stereo. in previous works, separate pose and depth cnn predictors had to be determined such that their joint outputs minimized the photometric error. inspired by recent advances in direct visual odometry (dvo), we argue that the depth cnn predictor can be learned without a pose cnn predictor. further, we demonstrate empirically that incorporation of a differentiable implementation of dvo - along with a novel depth normalization strategy - substantially improves performance over the state of the art that uses monocular videos for training.",4 "regular expressions for decoding of neural network outputs. this article proposes a convenient tool for decoding the output of neural networks trained by connectionist temporal classification (ctc) for handwritten text recognition. we use regular expressions to describe the complex structures expected in the writing. the corresponding finite automata are employed to build the decoder. we analyze theoretically which calculations are relevant and which can be avoided. a great speed-up results from this approximation. we conclude that the approximation most likely fails if the regular expression does not match the ground truth, which is not harmful for many applications, since the low probability will be even further underestimated. the proposed decoder is very efficient compared to other decoding methods. the variety of applications reaches from information retrieval to full text recognition. we refer to applications where we have integrated the proposed decoder successfully.",4 "to swap or not to swap?
exploiting dependency word pairs for reordering in statistical machine translation. reordering poses a major challenge in machine translation (mt) between two languages with significant differences in word order. in this paper, we present a novel reordering approach utilizing sparse features based on dependency word pairs. each instance of these features captures whether two words, which are related by a dependency link in the source sentence dependency parse tree, follow the same order or are swapped in the translation output. experiments on chinese-to-english translation show a statistically significant improvement of 1.21 bleu points using our approach, compared to a state-of-the-art statistical mt system that incorporates prior reordering approaches.",4 "supervised learning of sparse context reconstruction coefficients for data representation and classification. the context of a data point, usually defined as the other data points in a data set, has been found to play important roles in data representation and classification. in this paper, we study the problem of using the context of a data point for its classification. our work is inspired by the observation that actually only a few data points in the context are critical for a data point's representation and classification. we propose to represent a data point as a sparse linear combination of its context, and to learn the sparse context in a supervised way to increase its discriminative ability. to this end, we propose a novel formulation of context learning, modeling the learning of the context parameters and a classifier in a unified objective, and optimizing it with an alternating strategy in an iterative algorithm. experiments on three benchmark data sets show its advantage over state-of-the-art context-based data representation and classification methods.",4 "subspace alignment for domain adaptation. in this paper, we introduce a new domain adaptation (da) algorithm where the source and target domains are represented by subspaces spanned by eigenvectors. our method seeks a domain invariant feature space by learning a mapping function that aligns the source subspace with the target one. we show that the solution of the corresponding optimization problem can be obtained in a simple closed form, leading to an extremely fast algorithm. we present two approaches to determine the only hyper-parameter of our method, corresponding to the size of the subspaces.
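the closed-form alignment step mentioned above can be sketched in numpy; this follows the standard subspace-alignment recipe of aligning PCA bases via M = Bs.T @ Bt, with illustrative data and names (not the authors' code):

```python
import numpy as np

def pca_basis(X, d):
    """Top-d principal directions (columns) of the centered data X."""
    X = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[:d].T                      # shape (features, d)

def subspace_alignment(Xs, Xt, d):
    """Align the source PCA subspace with the target one via the
    closed-form solution M = Bs.T @ Bt, then project both domains."""
    Bs, Bt = pca_basis(Xs, d), pca_basis(Xt, d)
    M = Bs.T @ Bt                        # closed-form alignment matrix
    source_aligned = Xs @ (Bs @ M)       # source coords in aligned space
    target_proj = Xt @ Bt                # target coords in its own basis
    return source_aligned, target_proj

rng = np.random.default_rng(0)
Xs = rng.normal(size=(100, 10))          # hypothetical source domain
Xt = rng.normal(size=(80, 10)) + 0.5     # shifted target domain
Zs, Zt = subspace_alignment(Xs, Xt, d=3)
```

a classifier trained on `Zs` can then be applied to `Zt`; the speed the abstract claims comes from the alignment being a single matrix product rather than an iterative optimization.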
first approach tune size subspaces using theoretical bound stability obtained result. second approach, use maximum likelihood estimation determine subspace size, particularly useful high dimensional data. apart pca, propose subspace creation method outperform partial least squares (pls) linear discriminant analysis (lda) domain adaptation. test method various datasets show that, despite intrinsic simplicity, outperforms state art da methods.",4 "separators adjustment sets causal graphs: complete criteria algorithmic framework. principled reasoning identifiability causal effects non-experimental data important application graphical causal models. present algorithmic framework efficiently testing, constructing, enumerating $m$-separators ancestral graphs (ags), class graphical causal models represent uncertainty presence latent confounders. furthermore, prove reduction causal effect identification covariate adjustment $m$-separation subgraph directed acyclic graphs (dags) maximal ancestral graphs (mags). jointly, results yield constructive criteria characterize adjustment sets well minimal minimum adjustment sets identification desired causal effect multivariate exposures outcomes presence latent confounding. results extend several existing solutions special cases problems. efficient algorithms allowed us empirically quantify identifiability gap covariate adjustment do-calculus random dags, covering wide range scenarios. implementations algorithms provided r package dagitty.",4 "flexible interpretations: computational model dynamic uncertainty assessment. investigations reported paper center process dynamic uncertainty assessment interpretation tasks real domain. particular, interested nature control structure computer programs support multiple interpretation smooth transitions them, real time. 
step processing involves interpretation one input item appropriate re-establishment system's confidence correctness interpretation(s).",4 "stargan: unified generative adversarial networks multi-domain image-to-image translation. recent studies shown remarkable success image-to-image translation two domains. however, existing approaches limited scalability robustness handling more than two domains, since different models built independently every pair image domains. address limitation, propose stargan, novel scalable approach perform image-to-image translations multiple domains using single model. unified model architecture stargan allows simultaneous training multiple datasets different domains within single network. leads stargan's superior quality translated images compared existing models well novel capability flexibly translating input image desired target domain. empirically demonstrate effectiveness approach facial attribute transfer facial expression synthesis tasks.",4 "video object segmentation re-identification. conventional video segmentation methods often rely temporal continuity propagate masks. assumption suffers issues like drifting inability handle large displacement. overcome issues, formulate effective mechanism prevent target lost via adaptive object re-identification. specifically, video object segmentation re-identification (vs-reid) model includes mask propagation module reid module. former module produces initial probability map flow warping latter module retrieves missing instances adaptive matching. two modules iteratively applied, vs-reid records global mean (region jaccard boundary f measure) 0.699, best performance 2017 davis challenge.",4 "toward statistical mechanics four letter words. consider words network interacting letters, approximate probability distribution states taken network.
despite intuition rules english spelling highly combinatorial (and arbitrary), find maximum entropy models consistent pairwise correlations among letters provide surprisingly good approximation full statistics four letter words, capturing ~92% multi-information among letters even ""discovering"" real words represented data pairwise correlations estimated. maximum entropy model defines energy landscape space possible words, local minima landscape account nearly two-thirds words used written english.",16 "flag n' flare: fast linearly-coupled adaptive gradient methods. consider first order gradient methods effectively optimizing composite objective form sum smooth and, potentially, non-smooth functions. present accelerated adaptive gradient methods, called flag flare, offer best worlds. achieve optimal convergence rate attaining optimal first-order oracle complexity smooth convex optimization. additionally, adaptively non-uniformly re-scale gradient direction adapt limited curvature available conform geometry domain. show theoretically empirically that, compounding effects acceleration adaptivity, flag flare highly effective many data fitting machine learning applications.",12 "least-squares fir models low-resolution mr data efficient phase-error compensation simultaneous artefact removal. signal space models phase-encode, frequency-encode directions presented extrapolation 2d partial kspace. using boxcar representation low-resolution spatial data, geometrical representation signal space vectors positive negative phase-encode directions, robust predictor constructed using series signal space projections. compared existing phase-correction methods require acquisition pre-determined set fractional kspace lines, proposed predictor found efficient, due capability exhibiting equivalent degree performance using half number fractional lines. 
robust filtering noisy data achieved using second signal space model frequency-encode direction, bypassing requirement prior highpass filtering operation. signal space constructed fourier transformed samples row low-resolution image. set fir filters estimated fitting least squares model signal space. partial kspace extrapolation using fir filters shown result artifact-free reconstruction, particularly respect gibbs ringing streaking type artifacts.",4 "process monitoring sequences system call count vectors. introduce methodology efficient monitoring processes running hosts corporate network. methodology based collecting streams system calls produced selected processes hosts, sending network monitoring server, machine learning algorithms used identify changes process behavior due malicious activity, hardware failures, software errors. methodology uses sequence system call count vectors data format handle large varying volumes data. unlike previous approaches, methodology introduced paper suitable distributed collection processing data large corporate networks. evaluate methodology laboratory setting real-life setup provide statistics characterizing performance accuracy methodology.",4 "egocentric pose recognition four lines code. tackle problem estimating 3d pose individual's upper limbs (arms+hands) chest mounted depth-camera. importantly, consider pose estimation everyday interactions objects. past work shows strong pose+viewpoint priors depth-based features crucial robust performance. egocentric views, hands arms observable within well defined volume front camera. call volume egocentric workspace. notable property hand appearance correlates workspace location. exploit correlation, classify arm+hand configurations global egocentric coordinate frame, rather local scanning window. greatly simplify architecture improves performance. 
propose efficient pipeline 1) generates synthetic workspace exemplars training using virtual chest-mounted camera whose intrinsic parameters match physical camera, 2) computes perspective-aware depth features entire volume 3) recognizes discrete arm+hand pose classes sparse multi-class svm. method provides state-of-the-art hand pose recognition performance egocentric rgb-d images real-time.",4 "fast amortized inference learning log-linear models randomly perturbed nearest neighbor search. inference log-linear models scales linearly size output space worst-case. often bottleneck natural language processing computer vision tasks output space feasibly enumerable large. propose method perform inference log-linear models sublinear amortized cost. idea hinges using gumbel random variable perturbations pre-computed maximum inner product search data structure access most-likely elements sublinear amortized time. method yields provable runtime accuracy guarantees. further, present empirical experiments imagenet word embeddings showing significant speedups sampling, inference, learning log-linear models.",4 "application s-transform hyper kurtosis based modified duo histogram equalized dic images pre-cancer detection. proposed hyper kurtosis based histogram equalized dic images enhances contrast preserving brightness. evolution development precancerous activity among tissues studied s-transform (st). significant variations amplitude spectra observed due increased medium roughness normal tissue observed time-frequency domain. randomness inhomogeneity tissue structures among human normal different grades dic tissues recognized st based time-frequency analysis. study offers simpler better way recognize substantial changes among different stages dic tissues, reflected spatial information containing within inhomogeneity structures different types tissue.",4 "applying fuzzy id3 decision tree software effort estimation.
web effort estimation process predicting efforts cost terms money, schedule staff software project system. many estimation models proposed last three decades believed must purpose of: budgeting, risk analysis, project planning control, project improvement investment analysis. paper, investigate use fuzzy id3 decision tree software cost estimation; designed integrating principles id3 decision tree fuzzy set-theoretic concepts, enabling model handle uncertain imprecise data describing software projects, improve greatly accuracy obtained estimates. mmre pred used measures prediction accuracy study. series experiments reported using two different software projects datasets namely, tukutuku cocomo'81 datasets. results compared produced crisp version id3 decision tree.",4 "cma evolution strategy: tutorial. tutorial introduces cma evolution strategy (es), cma stands covariance matrix adaptation. cma-es stochastic, randomized, method real-parameter (continuous domain) optimization non-linear, non-convex functions. try motivate derive algorithm intuitive concepts requirements non-linear, non-convex search continuous domain.",4 "dynamic vulnerability map assess risk road network traffic utilization. le havre agglomeration (codah) includes 16 establishments classified seveso high threshold. literature, construct vulnerability maps help decision makers assess risk. approaches remain static take account population displacement estimation vulnerability. propose decision making tool based dynamic vulnerability map evaluate difficulty evacuation different sectors codah. use geographic information system (gis) visualize map evolves road traffic state detection communities large graphs algorithm.",4 "minimum description length induction, bayesianism, kolmogorov complexity. relationship bayesian approach minimum description length approach established. sharpen clarify general modeling principles mdl mml, abstracted ideal mdl principle defined bayes's rule means kolmogorov complexity. 
basic condition ideal principle applied encapsulated fundamental inequality, broad terms states principle valid data random, relative every contemplated hypothesis also hypotheses random relative (universal) prior. basically, ideal principle states prior probability associated hypothesis given algorithmic universal probability, sum log universal probability model plus log probability data given model minimized. restrict model class finite sets application ideal principle turns kolmogorov's minimal sufficient statistic. general show data compression almost always best strategy, hypothesis identification prediction.",4 "monitoring term drift based semantic consistency evolving vector field. based aristotelian concept potentiality vs. actuality allowing study energy dynamics language, propose field approach lexical analysis. falling back distributional hypothesis statistically model word meaning, used evolving fields metaphor express time-dependent changes vector space model combination random indexing evolving self-organizing maps (esom). monitor semantic drifts within observation period, experiment carried term space collection 12.8 million amazon book reviews. evaluation, semantic consistency esom term clusters compared respective neighbourhoods wordnet, contrasted distances among term vectors random indexing. found 0.05 level significance, terms clusters showed high level semantic consistency. tracking drift distributional patterns term space across time periods, found consistency decreased, statistically significant level. method highly scalable, interpretations philosophy.",4 "neural speed reading via skim-rnn. inspired principles speed reading, introduce skim-rnn, recurrent neural network (rnn) dynamically decides update small fraction hidden state relatively unimportant input tokens. skim-rnn gives computational advantage rnn always updates entire hidden state. skim-rnn uses input output interfaces standard rnn easily used instead rnns existing models. 
experiments, show skim-rnn achieve significantly reduced computational cost without losing accuracy compared standard rnns across five different natural language tasks. addition, demonstrate trade-off accuracy speed skim-rnn dynamically controlled inference time stable manner. analysis also shows skim-rnn running single cpu offers lower latency compared standard rnns gpus.",4 "recurrent deep stacking networks speech recognition. paper presented work applying recurrent deep stacking networks (rdsns) robust automatic speech recognition (asr) tasks. paper, also proposed efficient yet comparable substitute rdsn, bi-pass stacking network (bpsn). main idea two models add phoneme-level information acoustic models, transforming acoustic model combination acoustic model phoneme-level n-gram model. experiments showed rdsn bpsn substantially improve performances conventional dnns.",4 "image disguise based generative model. protect image contents, existing encryption algorithms designed transform original image texture-like noise-like image, is, however, obvious visual sign indicating presence encrypted image, results significantly large number attacks. solve problem, paper, propose new image encryption method generate visually image original one sending meaning-normal independent image corresponding well-trained generative model achieve effect disguising original image. image disguise method solves problem obvious visual implication, also guarantees security information.",4 "candidates v.s. noises estimation large multi-class classification problem. paper proposes method multi-class classification problems, number classes $k$ large. method, referred {\em candidates v.s. noises estimation} (cane), selects small subset candidate classes samples remaining classes. show cane always consistent computationally efficient. moreover, resulting estimator low statistical variance approaching maximum likelihood estimator, observed label belongs selected candidates high probability.
practice, use tree structure leaves classes promote fast beam search candidate selection. also apply cane method estimate word probabilities neural language models. experiments show cane achieves better prediction accuracy noise-contrastive estimation (nce), variants number state-of-the-art tree classifiers, gains significant speedup compared standard $\mathcal{o}(k)$ methods.",19 "distance function numbers. dempster-shafer theory widely applied uncertainty modelling knowledge reasoning due ability expressing uncertain information. distance two basic probability assignments(bpas) presents measure performance identification algorithms based evidential theory dempster-shafer. however, conditions lead limitations practical application dempster-shafer theory, exclusiveness hypothesis completeness constraint. overcome shortcomings, novel theory called numbers theory proposed. distance function numbers proposed measure distance two numbers. distance function numbers generalization distance two bpas, inherits advantage dempster-shafer theory strengthens capability uncertainty modeling. illustrative case provided demonstrate effectiveness proposed function.",4 "efficient effective single-document summarizations word-embedding measurement quality. task generate effective summary given document specific realtime requirements. use softplus function enhance keyword rankings favor important sentences, based present number summarization algorithms using various keyword extraction topic clustering methods. show algorithms meet realtime requirements yield best rouge recall scores duc-02 previously-known algorithms. evaluate quality summaries without human-generated benchmarks, define measure called wesm based word-embedding using word mover's distance.
show orderings rouge wesm scores algorithms highly comparable, suggesting wesm may serve viable alternative measuring quality summary.",4 "cost adaptation robust decentralized swarm behaviour. multi-agent swarm system robust paradigm drive efficient completion complex tasks even energy limitations time constraints. however, coordination swarm centralized command center difficult, particularly swarm becomes large spans wide ranges. here, leverage propagation messages based mesh-networking protocols global communication swarm online cost-optimization decentralized receding horizon control drive decentralized decision-making. cost-based formulation allows wide range tasks encoded. ensure this, implement method adaptation costs constraints ensures effectiveness novel tasks, network delays, heterogeneous flight capabilities, increasingly large swarms. use unity3d game engine build simulator capable introducing artificial networking failures delays swarm. using simulator validate method using example coordinated exploration task. release simulator code community future work.",4 "prototype knowledge-based programming environment. paper present proposal knowledge-based programming environment. environment, declarative background knowledge, procedures, concrete data represented suitable languages combined flexible manner. leads highly declarative programming style. illustrate approach example report prototype implementation.",4 "approximate muscle guided beam search three-index assignment problem. well-known np-hard problem, three-index assignment problem (ap3) attracted lots research efforts developing heuristics. however, existing heuristics either obtain less competitive solutions consume much time. paper, new heuristic named approximate muscle guided beam search (ambs) developed achieve good trade-off solution quality running time. combining approximate muscle beam search, solution space size significantly decreased, thus time searching solution sharply reduced. 
extensive experimental results benchmark indicate new algorithm able obtain solutions competitive quality employed instances large-scale. work paper proposes new efficient heuristic, also provides promising method improve efficiency beam search.",4 "doctag2vec: embedding based multi-label learning approach document tagging. tagging news articles blog posts relevant tags collection predefined ones coined document tagging work. accurate tagging articles benefit several downstream applications recommendation search. work, propose novel yet simple approach called doctag2vec accomplish task. substantially extend word2vec doc2vec---two popular models learning distributed representation words documents. doctag2vec, simultaneously learn representation words, documents, tags joint vector space training, employ simple $k$-nearest neighbor search predict tags unseen documents. contrast previous multi-label learning methods, doctag2vec directly deals raw text instead provided feature vector, addition, enjoys advantages like learning tag representation, ability handling newly created tags. demonstrate effectiveness approach, conduct experiments several datasets show promising results state-of-the-art methods.",4 "proposing lt based search pdm systems better information retrieval. pdm systems contain manage heavy amount data search mechanism systems intelligent process user's natural language based queries extract desired information. currently available search mechanisms almost pdm systems efficient based old ways searching information entering relevant information respective fields search forms find specific information attached repositories. targeting issue, thorough research conducted fields pdm systems language technology. concerning pdm system, conducted research provides information pdm pdm systems detail.
concerning field language technology, helps implementing search mechanism pdm systems search user's needed information analyzing user's natural language based requests. accomplished goal research support field pdm new proposition conceptual model implementation natural language based search. proposed conceptual model successfully designed partially implemented form prototype. describing proposition detail main concept, implementation designs developed prototype proposed approach discussed paper. implemented prototype compared respective functions existing pdm systems, i.e., windchill cim evaluate effectiveness targeted challenges.",4 "distributed air traffic control: human safety perspective. issues air traffic control far addressed intent improve resource utilization achieve optimized solution respect fuel consumption aircrafts, efficient usage available airspace minimal congestion related losses various dynamic constraints. focus almost always smarter management traffic increase profits human safety, though achieved process, believe, remained less seriously attended. become important given overburdened overstressed air traffic controllers managing hundreds airports thousands aircrafts per day. propose multiagent system based distributed approach handle air traffic ensuring complete human (passenger) safety without removing humans (ground controllers) loop thereby also retaining earlier advantages new solution. detailed design agent system, easily interfacable existing environment, described. based initial findings simulations, strongly believe system capable handling nuances involved, extendable customizable later point time.",4 "representation texts complex networks: mesoscopic approach. statistical techniques analyze texts, referred text analytics, departed use simple word count statistics towards new paradigm. text mining hinges sophisticated set methods, including representations terms complex networks.
well-established word-adjacency (co-occurrence) methods successfully grasp syntactical features written texts, unable represent important aspects textual data, topical structure, i.e. sequence subjects developing mesoscopic level along text. aspects often overlooked current methodologies. order grasp mesoscopic characteristics semantical content written texts, devised network model able analyze documents multi-scale fashion. proposed model, limited amount adjacent paragraphs represented nodes, connected whenever share minimum semantical content. illustrate capabilities model, present, case example, qualitative analysis ""alice's adventures wonderland"". show mesoscopic structure document, modeled network, reveals many semantic traits texts. approach paves way myriad semantic-based applications. addition, approach illustrated machine learning context, texts classified among real texts randomized instances.",4 "teaching machines code: neural markup generation visual attention. present deep recurrent neural network model soft visual attention learns generate latex markup real-world math formulas given images. applying neural sequence generation techniques successful fields machine translation image/handwriting/speech captioning, recognition, transcription synthesis, construct image-to-markup model learns produce syntactically semantically correct latex markup code 150 words long achieves bleu score 89%; best reported far im2latex problem. also visually demonstrate model learns scan image left-right / up-down much human would read it.",4 "statistical keyword detection literary corpora. understanding complexity human language requires appropriate analysis statistical distribution words texts. consider information retrieval problem detecting ranking relevant words text means statistical information referring ""spatial"" use words. shannon's entropy information used tool automatic keyword extraction. 
using origin species charles darwin representative text sample, show performance detector compare another proposals literature. random shuffled text receives special attention tool calibrating ranking indices.",4 "representation embedding knowledge bases beyond binary relations. models developed date knowledge base embedding based assumption relations contained knowledge bases binary. training testing embedding models, multi-fold (or n-ary) relational data converted triples (e.g., fb15k dataset) interpreted instances binary relations. paper presents canonical representation knowledge bases containing multi-fold relations. show existing embedding models popular fb15k datasets correspond sub-optimal modelling framework, resulting loss structural information. advocate novel modelling framework, models multi-fold relations directly using canonical representation. using framework, existing transh model generalized new model, m-transh. demonstrate experimentally m-transh outperforms transh large margin, thereby establishing new state art.",4 "affact - alignment-free facial attribute classification technique. facial attributes soft-biometrics allow limiting search space, e.g., rejecting identities non-matching facial characteristics nose sizes eyebrow shapes. paper, investigate latest versions deep convolutional neural networks, resnets, perform facial attribute classification task. test two loss functions: sigmoid cross-entropy loss euclidean loss, find classification performance little difference two. using ensemble three resnets, obtain new state-of-the-art facial attribute classification error 8.00% aligned images celeba dataset. significantly, introduce alignment-free facial attribute classification technique (affact), data augmentation technique allows network classify facial attributes without requiring alignment beyond detected face bounding boxes. 
best knowledge, first report similar accuracy using detected bounding boxes -- rather requiring alignment based automatically detected facial landmarks -- improve classification accuracy rotating scaling test images. show approach outperforms celeba baseline unaligned images relative improvement 36.8%.",4 "interpretation mammogram chest x-ray reports using deep neural networks - preliminary results. radiology reports important means communication radiologists physicians. reports express radiologist's interpretation medical imaging examination critical establishing diagnosis formulating treatment plan. paper, propose bi-directional convolutional neural network (bi-cnn) model interpretation classification mammograms based breast density chest radiographic radiology reports based basis chest pathology. proposed approach helps organize databases radiology reports, retrieve expeditiously, evaluate radiology report could used auditing system decrease incorrect diagnoses. study revealed proposed bi-cnn outperforms random forest support vector machine methods.",4 "saliency benchmarking: separating models, maps metrics. field fixation prediction heavily model-driven, dozens new models published every year. however, progress field difficult judge models compared using variety inconsistent metrics. soon saliency map optimized certain metric, penalized metrics. propose principled approach solve benchmarking problem: separate notions saliency models saliency maps. define saliency model probabilistic model fixation density prediction and, inspired bayesian decision theory, saliency map metric-specific prediction derived model density maximizes expected performance metric. derive optimal saliency map commonly used saliency metrics (auc, sauc, nss, cc, sim, kl-div) show computed analytically approximated high precision using model density. show leads consistent rankings metrics avoids penalties using one saliency map metrics. 
framework, ""good"" models perform well metrics.",4 "selecting best player formation corner-kick situations based bayes' estimation. domain soccer simulation 2d league robocup project, appropriate player positioning given opponent team important factor soccer team performance. work proposes model decides strategy applied regarding particular opponent team. task realized applying preliminary learning phase model determines effective strategies clusters opponent teams. model determines best strategies using sequential bayes' estimators. first trial system, proposed model used determine association player formations opponent teams particular situation corner-kick. implemented model shows satisfying abilities compare player formations similar terms performance determines right ranking even running decent number simulation games.",4 "morally acceptable system lie persuade me?. given fast rise increasingly autonomous artificial agents robots, key acceptability criterion possible moral implications actions. particular, intelligent persuasive systems (systems designed influence humans via communication) constitute highly sensitive topic intrinsically social nature. still, ethical studies area rare tend focus output required action. instead, work focuses persuasive acts (e.g. ""is morally acceptable machine lies appeals emotions person persuade her, even good end?""). exploiting behavioral approach, based human assessment moral dilemmas -- i.e. without prior assumption underlying ethical theories -- paper reports set experiments. experiments address type persuader (human machine), strategies adopted (purely argumentative, appeal positive emotions, appeal negative emotions, lie) circumstances. findings display differences due agent, mild acceptability persuasion reveal truth-conditional reasoning (i.e. argument validity) significant dimension affecting subjects' judgment. 
implications design intelligent persuasive systems discussed.",4 "large-scale domain adaptation via teacher-student learning. high accuracy speech recognition requires large amount transcribed data supervised training. absence data, domain adaptation well-trained acoustic model performed, even here, high accuracy usually requires significant labeled data target domain. work, propose approach domain adaptation require transcriptions instead uses corpus unlabeled parallel data, consisting pairs samples source domain well-trained model desired target domain. perform adaptation, employ teacher/student (t/s) learning, posterior probabilities generated source-domain model used lieu labels train target-domain model. evaluate proposed approach two scenarios, adapting clean acoustic model noisy speech adapting adults speech acoustic model children speech. significant improvements accuracy obtained, reductions word error rate 44% original source model without need transcribed data target domain. moreover, show increasing amount unlabeled data results additional model robustness, particularly beneficial using simulated training data target-domain.",4 "temporal human action segmentation via dynamic clustering. present effective dynamic clustering algorithm task temporal human action segmentation, comprehensive applications robotics, motion analysis, patient monitoring. proposed algorithm unsupervised, fast, generic process various types features, applicable online offline settings. perform extensive experiments processing data streams, show algorithm achieves state-of-the-art results online offline settings.",4 "indefinite kernel logistic regression. traditionally, kernel learning methods requires positive definitiveness kernel, strict excludes many sophisticated similarities, indefinite, multimedia area. utilize indefinite kernels, indefinite learning methods great interests. paper aims extension logistic regression positive semi-definite kernels indefinite kernels. 
the model, called indefinite kernel logistic regression (iklr), keeps consistency with the regular klr formulation but essentially becomes non-convex. thanks to the positive decomposition of an indefinite matrix, iklr can be transformed into a difference of two convex models, which allows the use of the concave-convex procedure. moreover, we employ an inexact solving scheme to speed up the sub-problem and develop a concave-inexact-convex procedure (ccicp) algorithm with theoretical convergence analysis. systematical experiments on multi-modal datasets demonstrate the superiority of the proposed iklr method over kernel logistic regression with positive definite kernels and over state-of-the-art indefinite-learning-based algorithms.",4 "well-typed lightweight situation calculus. the situation calculus is widely applied in artificial intelligence and related fields. this formalism is considered a dialect of logic programming language and is mostly used for dynamic domain modeling. however, type systems are hardly deployed in the situation calculus literature. to achieve correct and sound typed programs written in the situation calculus, adding typing elements to the current situation calculus is quite helpful. in this paper, we propose to add typing mechanisms to the current version of the situation calculus, especially for its three basic elements: situations, actions and objects, and to perform rigid type checking for existing situation calculus programs to find the well-typed and ill-typed ones. in this way, type correctness and soundness of situation calculus programs are guaranteed by type checking based on our type system. the modified version of the lightweight situation calculus is proved to be a robust well-typed system.",4 "online adaptive pseudoinverse solutions for elm weights. the elm method has become widely used for classification and regression problems as a result of its accuracy, simplicity and ease of use. the solution of the hidden layer weights by means of a matrix pseudoinverse operation is a significant contributor to the utility of the method; however, the conventional calculation of the pseudoinverse by means of a singular value decomposition (svd) is not always practical for large data sets or for online updates to the solution. this paper discusses incremental methods for solving the pseudoinverse which are suitable for elm.
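one standard incremental alternative to recomputing an svd, sketched here under our own assumptions (the abstract does not specify which incremental method is used), is a recursive least-squares update of the regularized pseudoinverse solution via the sherman-morrison identity; the names `elm_init`/`elm_update` and the ridge term `lam` are illustrative:

```python
import numpy as np

# Hypothetical sketch: online update of ELM output weights without an SVD.
# We maintain P = (H^T H + lam*I)^{-1} and W = P @ H^T @ T, folding in one
# new hidden-layer activation row h (length d) and target row t (length m)
# per step via the Sherman-Morrison rank-one identity.

def elm_init(d, m, lam=1e-3):
    P = np.eye(d) / lam          # inverse of the regularized Gram matrix
    W = np.zeros((d, m))         # current output weights
    return P, W

def elm_update(P, W, h, t):
    h = h.reshape(-1, 1)         # column vector (d x 1)
    t = t.reshape(1, -1)         # row vector (1 x m)
    Ph = P @ h
    k = Ph / (1.0 + h.T @ Ph)    # gain vector
    P = P - k @ Ph.T             # Sherman-Morrison rank-one update of P
    W = W + k @ (t - h.T @ W)    # recursive least-squares weight update
    return P, W
```

applied row by row, this reproduces (up to the small ridge `lam`) the batch pseudoinverse solution while costing only o(d^2) per new sample.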
we show that a careful choice of methods allows us to optimize for accuracy, ease of computation, or adaptability of the solution.",4 "constraint propagation for first-order logic and inductive definitions. constraint propagation is one of the basic forms of inference in many logic-based reasoning systems. in this paper, we investigate constraint propagation for first-order logic (fo), a suitable language to express a wide variety of constraints. we present an algorithm with polynomial-time data complexity for constraint propagation in the context of an fo theory and a finite structure. we show that constraint propagation in this manner can be represented by a datalog program and that the algorithm can be executed symbolically, i.e., independently of the structure. next, we extend the algorithm to fo(id), the extension of fo with inductive definitions. finally, we discuss several applications.",4 "comparative studies on decentralized multiloop pid controller design using evolutionary algorithms. decentralized pid controllers are designed in this paper for simultaneous tracking of individual process variables in multivariable systems under a step reference input. the controller design framework takes into account the minimization of a weighted sum of the integral of time multiplied squared error (itse) and the integral of squared controller output (isco) so as to balance the overall tracking errors for the process variables against the required variation in the corresponding manipulated variables. the decentralized pid gains are tuned using three popular evolutionary algorithms (eas), viz. genetic algorithm (ga), evolutionary strategy (es) and cultural algorithm (ca). credible simulation comparisons are reported for four benchmark 2x2 multivariable processes.",4 "multiresolution hierarchical analysis of astronomical spectroscopic cubes using 3d discrete wavelet transform. the intrinsically hierarchical and blended structure of interstellar molecular clouds, plus the always increasing resolution of astronomical instruments, demand advanced and automated pattern recognition techniques for identifying and connecting source components in spectroscopic cubes.
we extend the work done on multiresolution analysis using wavelets for astronomical 2d images to 3d spectroscopic cubes, combining the results with the dendrograms approach to offer a hierarchical representation of the connections between sources at different scale levels. we test our approach on real data from the alma observatory, exploring different wavelet families and assessing the main parameter for source identification (i.e., rms) at each level. our approach shows that it is feasible to perform multiresolution analysis in the spatial and frequency domains simultaneously rather than analyzing each spectral channel independently.",4 "a unified approach to error bounds for structured convex optimization problems. error bounds, which refer to inequalities that bound the distance of vectors in a test set to a given set by a residual function, have proven to be extremely useful in analyzing the convergence rates of a host of iterative methods for solving optimization problems. in this paper, we present a new framework for establishing error bounds for a class of structured convex optimization problems, in which the objective function is the sum of a smooth convex function and a general closed proper convex function. such a class encapsulates fairly general constrained minimization problems as well as various regularized loss minimization formulations in machine learning, signal processing, and statistics. using this framework, we show that a number of existing error bound results can be recovered in a unified and transparent manner. to demonstrate the power of our framework, we apply it to a class of nuclear-norm regularized loss minimization problems and establish a new error bound for this class under a strict complementarity-type regularity condition. we then complement this result by constructing an example to show that the said error bound could fail to hold without the regularity condition. consequently, we obtain a rather complete answer to a question raised by tseng. we believe that our approach will find further applications in the study of error bounds for structured convex optimization problems.",12 "on face segmentation, face swapping, and face perception. we show that even when face images are unconstrained and arbitrarily paired, face swapping is actually quite simple. to this end, we make the following contributions.
(a) instead of tailoring systems for face segmentation, as others previously proposed, we show that a standard fully convolutional network (fcn) can achieve remarkably fast and accurate segmentations, provided that it is trained on a rich enough example set. for this purpose, we describe novel data collection and generation routines which provide challenging segmented face examples. (b) we use our segmentations to enable robust face swapping under unprecedented conditions. (c) unlike previous work, our swapping is robust enough to allow for extensive quantitative tests. to this end, we use the labeled faces in the wild (lfw) benchmark and measure the effect of intra- and inter-subject face swapping on recognition. we show that our intra-subject swapped faces remain as recognizable as their sources, testifying to the effectiveness of our method. in line with well known perceptual studies, we show that better face swapping produces less recognizable inter-subject results. this is the first time this effect is quantitatively demonstrated for machine vision systems.",4 "homomorphic signal processing and deep neural networks: constructing deep algorithms for polyphonic music transcription. this paper presents a new approach to understanding how deep neural networks (dnns) work by applying homomorphic signal processing techniques. focusing on the task of multi-pitch estimation (mpe), this paper demonstrates the equivalence relation between the generalized cepstrum and the dnn in terms of their structures and functionality. such an equivalence relation, together with pitch perception theories and the recently established rectified-correlations-on-a-sphere (recos) filter analysis, provides an alternative way of explaining the role of the nonlinear activation function and the multi-layer structure, both of which exist in the cepstrum and the dnn. to validate the efficacy of this new approach, a new feature designed in this fashion is proposed for the pitch salience function. the new feature outperforms the one-layer spectrum in the mpe task and, as predicted, it addresses the issue of the missing fundamental effect and also achieves better robustness to noise.",4 "graph partitioning via parallel submodular approximation to accelerate distributed machine learning. distributed computing excels at processing large scale data, but the communication cost for synchronizing the shared parameters may slow down the overall performance.
fortunately, the interactions between parameters and data in many problems are sparse, which admits an efficient partition in order to reduce the communication overhead. in this paper, we formulate data placement as a graph partitioning problem. we propose a distributed partitioning algorithm and give theoretical guarantees. we also provide a highly efficient implementation of the algorithm and demonstrate promising results on both text datasets and social networks. we show that the proposed algorithm leads to a 1.6x speedup of a state-of-the-art distributed machine learning system by eliminating 90\% of the network communication.",4 "fast eigenspace approximation using random signals. we focus in this work on the estimation of the first $k$ eigenvectors of a graph laplacian using filtering of gaussian random signals. we prove that we only need $k$ such signals to be able to exactly recover as many of the smallest eigenvectors, regardless of the number of nodes in the graph. in addition, we address key issues in implementing the theoretical concepts in practice using accurate approximated methods. we also propose fast algorithms for eigenspace approximation and for the determination of the $k$th smallest eigenvalue $\lambda_k$. the latter proves to be extremely efficient under the assumption of a locally uniform distribution of the eigenvalue spectrum. finally, we present experiments which show the validity of our method in practice and compare it to state-of-the-art methods for clustering and visualization, both on synthetic small-scale datasets and on larger real-world problems with millions of nodes. we show that our method allows a better scaling with the number of nodes than previous methods while achieving an almost perfect reconstruction of the eigenspace formed by the first $k$ eigenvectors.",4 "a linear shift invariant multiscale transform. this paper presents a multiscale decomposition algorithm. unlike standard wavelet transforms, the proposed operator is linear and shift invariant. the central idea is to obtain shift invariance by averaging the aligned wavelet transform projections over all circular shifts of the signal. it is shown how the transform can be obtained from a linear filter bank.",4 "a parallel corpus of translationese. we describe a set of bilingual english--french and english--german parallel corpora in which the direction of translation is accurately and reliably annotated.
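the idea of recovering the smallest laplacian eigenvectors by filtering random signals can be sketched as follows; this is our own minimal illustration (not the paper's exact algorithm), assuming a crude low-pass filter $(I+\alpha L)^{-1}$ applied repeatedly, followed by a rayleigh-ritz step:

```python
import numpy as np

# Hypothetical sketch: approximate the first k eigenvectors of a graph
# laplacian L by low-pass filtering k gaussian random signals (damping the
# high-frequency components) and extracting an orthonormal basis via a
# Rayleigh-Ritz projection. alpha and power are illustrative knobs.

def approx_eigenspace(L, k, alpha=10.0, power=16, seed=0):
    n = L.shape[0]
    rng = np.random.default_rng(seed)
    R = rng.normal(size=(n, k))             # k random gaussian signals
    F = np.linalg.inv(np.eye(n) + alpha * L)  # crude low-pass graph filter
    for _ in range(power):                  # repeated filtering
        R = F @ R
    Q, _ = np.linalg.qr(R)                  # orthonormal basis of filtered span
    S = Q.T @ L @ Q                         # Rayleigh-Ritz projection
    w, V = np.linalg.eigh(S)
    return Q @ V, w                         # approximate eigenvectors / values
```

on a small path-graph laplacian this recovers the smallest eigenpairs closely; a practical implementation would of course use sparse solves rather than a dense inverse.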
the corpora are diverse, consisting of parliamentary proceedings, literary works, transcriptions of ted talks and political commentary. they will be instrumental for research on translationese and its applications to (human and machine) translation; specifically, they can be used for the task of translationese identification, a research direction that enjoys growing interest in recent years. to validate the quality and reliability of the corpora, we replicated previous results of supervised and unsupervised identification of translationese, and further extended the experiments to additional datasets and languages.",4 "lifted region-based belief propagation. due to the intractable nature of exact lifted inference, research has recently focused on the discovery of accurate and efficient approximate inference algorithms for statistical relational models (srms), such as lifted first-order belief propagation. fobp simulates propositional factor graph belief propagation without constructing the ground factor graph by identifying and lifting redundant message computations. in this work, we propose a generalization of fobp called lifted generalized belief propagation, in which both the region structure and the message structure can be lifted. this approach allows more inference to be performed intra-region (in the exact inference step of bp), thereby allowing simulation of propagation on a graph structure with larger region scopes and fewer edges, while still maintaining tractability. we demonstrate that the resulting algorithm converges in fewer iterations to more accurate results on a variety of srms.",4 "pediatric bone age assessment using deep convolutional neural networks. skeletal bone age assessment is a common clinical practice to diagnose endocrine and metabolic disorders in child development. in this paper, we describe a fully automated deep learning approach to the problem of bone age assessment using data from the pediatric bone age challenge organized by rsna 2017. the dataset for this competition consisted of 12.6k radiological images of the left hand labeled with the bone age and sex of the patients. our approach utilizes several deep learning architectures: u-net, resnet-50, and custom vgg-style neural networks trained end-to-end. we use images of whole hands as well as specific parts of the hand for both training and inference.
this approach allows us to measure the importance of specific hand bones for the automated bone age analysis. we further evaluate the performance of the method in the context of skeletal development stages. our approach outperforms other common methods for bone age assessment.",4 "parameterized complexity results for symmetry breaking. symmetry is a common feature of many combinatorial problems. unfortunately, eliminating all symmetry from a problem is often computationally intractable. this paper argues that recent parameterized complexity results provide insight into that intractability and help identify special cases in which symmetry can be dealt with more tractably.",4 "polyglot semantic parsing in apis. traditional approaches to semantic parsing (sp) work by training individual models for each available parallel dataset of text-meaning pairs. in this paper, we explore the idea of polyglot semantic translation, or learning semantic parsing models that are trained on multiple datasets and natural languages. in particular, we focus on translating text to code signature representations using the software component datasets of richardson and kuhn (2017a,b). the advantage of such models is that they can be used for parsing a wide variety of input natural languages and output programming languages, or mixed input languages, using a single unified model. to facilitate modeling of this type, we develop a novel graph-based decoding framework that achieves state-of-the-art performance on the above datasets, and we apply this method to two other benchmark sp tasks.",4 "object recognition with imperfect perception and redundant description. this paper deals with a scene recognition system in a robotics context. the general problem is to match images with a priori descriptions. a typical mission would consist in identifying an object in an installation, where the vision system is situated at the end of a manipulator and a human operator has provided a description, formulated in pseudo-natural language, which is possibly redundant. the originality of the work comes from the nature of the description; special attention is given to the management of imprecision and uncertainty in the interpretation process and to the way the redundancy of the description is assessed to reinforce the overall matching likelihood.",4 "where is my device? - detecting the smart device's wearing location in the context of active safety for vulnerable road users.
this article describes an approach to detect the wearing location of smart devices worn by pedestrians and cyclists. the detection, based solely on the sensors of the smart devices, is important context-information that can be used to parametrize subsequent algorithms, e.g. dead reckoning or intention detection, to improve the safety of vulnerable road users. wearing location recognition in terms of organic computing (oc) can be seen as a step towards self-awareness and self-adaptation. the wearing location detection is presented as a two-stage process, subdivided into a moving detection followed by the wearing location classification. finally, the approach is evaluated on a real world dataset consisting of pedestrians and cyclists.",4 "stability of phase retrievable frames. in this paper we study the property of phase retrievability of redundant systems of vectors under perturbations of the frame set. specifically we show that if a set $\fc$ of $m$ vectors in a complex hilbert space of dimension n allows for vector reconstruction from the magnitudes of its coefficients, then there is a perturbation bound $\rho$ so that any frame set within $\rho$ of $\fc$ has the same property. in particular this proves that the recent construction in \cite{bh13} is stable under perturbations. by the same token we reduce the critical cardinality conjectured in \cite{bcmn13a} by proving a stability result for non phase-retrievable frames.",12 "global preferential consistency for the topological sorting-based maximal spanning tree problem. we introduce a new type of fully computable problems, for dss dedicated to maximal spanning tree problems, based on deduction and choice: preferential consistency problems. to show its interest, we describe a new compact representation of preferences specific to spanning trees, identifying an efficient maximal spanning tree sub-problem. next, we compare this problem with the pareto-based multiobjective one. last, we propose an efficient algorithm for solving the associated preferential consistency problem.",4 "non-sparse linear representations for visual tracking with online reservoir metric learning. most sparse linear representation-based trackers need to solve a computationally expensive l1-regularized optimization problem.
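the contrast the title draws can be made concrete: an l2-regularized (non-sparse) representation has a closed-form solution, unlike the iterative l1 problem. the sketch below is illustrative only (the function name and `lam` are ours, not from the paper):

```python
import numpy as np

# Minimal sketch of a non-sparse (ridge) linear representation:
#   min_c ||y - D c||^2 + lam * ||c||^2
# has the closed-form solution c = (D^T D + lam*I)^{-1} D^T y,
# where D holds template features as columns and y is the candidate patch.

def represent(D, y, lam=0.01):
    d = D.shape[1]
    return np.linalg.solve(D.T @ D + lam * np.eye(d), D.T @ y)
```

candidate patches can then be scored by the reconstruction error ||y - Dc||, which is the usual use of such representations in tracking.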
to address this problem, we propose a visual tracker based on non-sparse linear representations, which admit an efficient closed-form solution without sacrificing accuracy. moreover, in order to capture the correlation information between different feature dimensions, we learn a mahalanobis distance metric in an online fashion and incorporate the learned metric into the optimization problem for obtaining the linear representation. we show that online metric learning using proximity comparison significantly improves the robustness of tracking, especially on those sequences exhibiting drastic appearance changes. furthermore, in order to prevent the unbounded growth in the number of training samples for metric learning, we design a time-weighted reservoir sampling method to maintain and update limited-sized foreground and background sample buffers for balancing sample diversity and adaptability. experimental results on challenging videos demonstrate the effectiveness and robustness of the proposed tracker.",4 "an investigation report on auction mechanism design. auctions are markets with strict regulations governing the information available to traders in the market and the possible actions they can take. since well designed auctions achieve desirable economic outcomes, they have been widely used in solving real-world optimization problems, and in structuring stock and futures exchanges. auctions also provide a valuable testing-ground for economic theory, and they play an important role in computer-based control systems. auction mechanism design aims to manipulate the rules of an auction in order to achieve specific goals. economists traditionally use mathematical methods, mainly game theory, to analyze auctions and design new auction forms. however, due to the high complexity of auctions, mathematical models are typically simplified to obtain results, which makes it difficult to apply the results derived from these models to market environments in the real world. as a result, researchers are turning to empirical approaches. this report aims to survey the theoretical and empirical approaches to designing auction mechanisms and trading strategies, with more weight on the empirical ones, and to build a foundation for further research in this field.",4 "flower pollination algorithm: a novel approach for multiobjective optimization.
multiobjective design optimization problems require multiobjective optimization techniques to solve, and it is often challenging to obtain high-quality pareto fronts accurately. in this paper, the recently developed flower pollination algorithm (fpa) is extended to solve multiobjective optimization problems. the proposed method is used to solve a set of multiobjective test functions and two bi-objective design benchmarks, and a comparison of the proposed algorithm with other algorithms has been made, which shows that fpa is efficient with a good convergence rate. finally, the importance of parametric studies and theoretical analysis is highlighted and discussed.",12 "polynomial neural networks learnt to classify eeg signals. a neural network based technique is presented, which is able to successfully extract polynomial classification rules from labeled electroencephalogram (eeg) signals. to represent the classification rules in an analytical form, we use polynomial neural networks trained by a modified group method of data handling (gmdh). the classification rules were extracted from clinical eeg data recorded from an alzheimer patient and sudden death risk patients. a third data set of eeg recordings includes normal and artifact segments. these eeg data were visually identified by medical experts. the extracted polynomial rules were verified on the testing eeg data and allow us to correctly classify 72% of the risk group patients and 96.5% of the segments. these rules perform slightly better than standard feedforward neural networks.",4 "reducing the computational cost of multi-objective evolutionary algorithms by filtering worthless individuals. the large number of exact fitness function evaluations makes evolutionary algorithms computationally costly. in real-world problems, reducing the number of evaluations is much more valuable, even at the cost of increasing computational complexity and spending more time. to fulfill this target, we introduce an effective factor, in place of the factor applied in adaptive fuzzy fitness granulation with the non-dominated sorting genetic algorithm-ii, to filter out worthless individuals more precisely. the proposed approach is compared against adaptive fuzzy fitness granulation with the non-dominated sorting genetic algorithm-ii, using the hyper volume and inverted generational distance performance measures.
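the inverted generational distance (igd) measure mentioned above is simple to state in code; this is a generic textbook-style sketch, not an implementation from either paper:

```python
import numpy as np

# Inverted generational distance (igd): the mean distance from each point of
# a reference pareto front to the closest point of the obtained approximation.
# Lower is better; unlike plain gd, it also penalizes poor coverage of the
# reference front.

def igd(reference, obtained):
    reference = np.asarray(reference, float)
    obtained = np.asarray(obtained, float)
    # pairwise distances: reference points (rows) vs obtained points (cols)
    d = np.linalg.norm(reference[:, None, :] - obtained[None, :, :], axis=-1)
    return d.min(axis=1).mean()
```

a front identical to the reference scores exactly zero, and dropping points from a well-spread front increases the score.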
the proposed method is applied to 1 traditional and 1 state-of-the-art benchmark, considering 3 different dimensions. from an average performance view, the results indicate that although decreasing the number of fitness evaluations leads to some performance reduction, it is not tangible compared to the gain.",4 "fractal dimension based optimal wavelet packet analysis technique for classification of meningioma brain tumours. with the heterogeneous nature of tissue texture, using a single resolution approach for optimum classification might not suffice. in contrast, a multiresolution wavelet packet analysis can decompose the input signal into a set of frequency subbands, giving the opportunity to characterise the texture at the appropriate frequency channel. an adaptive best bases algorithm for optimal bases selection for meningioma histopathological images is proposed, via applying the fractal dimension (fd) as the bases selection criterion in a tree-structured manner. thereby, only the most significant subband that better identifies texture discontinuities is chosen for further decomposition, and its fractal signature represents the extracted feature vector for classification. the best basis selection using fd outperformed the energy based selection approaches, achieving an overall classification accuracy of 91.25% as compared to 83.44% and 73.75% for the co-occurrence matrix and energy texture signatures, respectively.",4 "deep reinforcement learning for unsupervised video summarization with diversity-representativeness reward. video summarization aims to facilitate large-scale video browsing by producing short, concise summaries that are diverse and representative of the original videos. in this paper, we formulate video summarization as a sequential decision-making process and develop a deep summarization network (dsn) to summarize videos. dsn predicts for each video frame a probability, which indicates how likely a frame is to be selected, and then takes actions based on the probability distributions to select frames, forming video summaries. to train our dsn, we propose an end-to-end, reinforcement learning-based framework, where we design a novel reward function that jointly accounts for the diversity and representativeness of generated summaries and does not rely on labels or user interactions at all.
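a diversity-representativeness reward of the kind described above can be sketched as follows; this is a hedged illustration in the spirit of the abstract, not the paper's exact formulation (the combination of cosine dissimilarity and an exponential of the nearest-selected-frame distance is our own choice):

```python
import numpy as np

# Illustrative reward for a frame selection: given frame features X (n x d)
# and a boolean selection mask,
#  - diversity: mean pairwise cosine dissimilarity among selected frames
#  - representativeness: decays with the mean distance from every frame to
#    its nearest selected frame (good coverage -> reward close to 1)

def reward(X, selected):
    S = X[selected]
    # diversity term
    Sn = S / np.linalg.norm(S, axis=1, keepdims=True)
    sim = Sn @ Sn.T
    n = len(S)
    div = (1.0 - sim[~np.eye(n, dtype=bool)]).mean() if n > 1 else 0.0
    # representativeness term
    d = np.linalg.norm(X[:, None, :] - S[None, :, :], axis=-1)
    rep = np.exp(-d.min(axis=1).mean())
    return div + rep
```

selecting one frame per visual cluster scores higher than selecting two near-duplicates, which is exactly the behaviour such a reward is meant to encourage.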
during training, the reward function judges how diverse and representative the generated summaries are, while the dsn strives for earning higher rewards by learning to produce more diverse and more representative summaries. since no labels are required, our method is fully unsupervised. extensive experiments on two benchmark datasets show that our unsupervised method not only outperforms other state-of-the-art unsupervised methods, but is also comparable to or even superior to most published supervised approaches.",4 "the assumptions behind dempster's rule. this paper examines the concept of a combination rule for belief functions. it is shown that two fairly simple and apparently reasonable assumptions determine dempster's rule, giving a new justification for it.",4 "an optimal algorithm for the thresholding bandit problem. we study a specific \textit{combinatorial pure exploration stochastic bandit problem} where the learner aims at finding the set of arms whose means are above a given threshold, up to a given precision, and \textit{for a fixed time horizon}. we propose a parameter-free algorithm based on an original heuristic, and prove that it is optimal for this problem by deriving matching upper and lower bounds. to the best of our knowledge, this is the first non-trivial pure exploration setting with \textit{fixed budget} for which optimal strategies are constructed.",19 "a survey of deep learning techniques for mobile robot applications. advancements in deep learning over the years have attracted research into how deep artificial neural networks can be used in robotic systems. in this research survey, we present a summarization of the current research with a specific focus on the gains and obstacles for deep learning to be applied to mobile robotics.",4 "proceedings of the workshop on brain analysis using connectivity networks - bacon 2016. understanding brain connectivity in a network-theoretic context has shown much promise in recent years. this type of analysis identifies brain organisational principles, bringing a new perspective to neuroscience. at the same time, large public databases of connectomic data are now available. however, connectome analysis is still an emerging field and there is a crucial need for robust computational methods to fully unravel its potential.
this workshop provides a platform to discuss the development of new analytic techniques; methods for evaluating and validating commonly used approaches; as well as the effects of variations in pre-processing steps.",4 "forecasting sleep apnea with dynamic network models. dynamic network models (dnms) are belief networks for temporal reasoning. the dnm methodology combines techniques from time series analysis and probabilistic reasoning to provide (1) a knowledge representation that integrates noncontemporaneous and contemporaneous dependencies and (2) methods for iteratively refining these dependencies in response to the effects of exogenous influences. we use belief-network inference algorithms to perform forecasting, control, and discrete event simulation on dnms. the belief network formulation allows us to move beyond the traditional assumptions of linearity in the relationships among time-dependent variables and of normality in their probability distributions. we demonstrate the dnm methodology on an important forecasting problem in medicine, and conclude with a discussion of how the methodology addresses several limitations found in traditional time series analyses.",4 "an extended comment on language trees and zipping. this is an extended version of our comment submitted to physical review letters. we first point out the inappropriateness of publishing a letter unrelated to physics. next, we give experimental results showing that the technique used in the letter is 3 times worse and 17 times slower than a simple baseline. finally, we review the literature, showing that the ideas of the letter are not novel. we conclude by suggesting that physical review letters should not publish letters unrelated to physics.",3 "automatic data deformation analysis in an evolving folksonomy driven environment. the folksodriven framework makes it possible for data scientists to define an ontology environment in which to search for buried patterns that have some kind of predictive power, so as to build predictive models more effectively. it accomplishes this through abstractions that isolate the parameters of the predictive modeling process, the search for patterns, and the design of the feature set, too. to reflect evolving knowledge, this paper considers ontologies based on folksonomies according to a new concept structure called ""folksodriven"" to represent folksonomies.
so, the studies on the transformational regulation of folksodriven tags are regarded as important for adaptive folksonomy classifications in an evolving environment used by intelligent systems to represent knowledge sharing. folksodriven tags are used to categorize salient data points so that they can be fed to a machine-learning system for ""featurizing"" the data.",4 "meta networks. neural networks have been successfully applied in applications with a large amount of labeled data. however, the task of rapid generalization on new concepts with small training data while preserving performance on previously learned ones still presents a significant challenge to neural network models. in this work, we introduce a novel meta learning method, meta networks (metanet), that learns meta-level knowledge across tasks and shifts its inductive biases via fast parameterization for rapid generalization. when evaluated on the omniglot and mini-imagenet benchmarks, our metanet models achieve near human-level performance and outperform the baseline approaches by up to 6% accuracy. we demonstrate several appealing properties of metanet relating to generalization and continual learning.",4 "ltsg: latent topical skip-gram for mutually learning topic model and vector representations. topic models have been widely used for discovering latent topics which are shared across documents in text mining. vector representations, word embeddings and topic embeddings, map words and topics into a low-dimensional and dense real-value vector space, and have obtained high performance in nlp tasks. however, most existing models assume that the result trained by one of them is perfectly correct and use it as prior knowledge for improving the other model; none of them uses information trained on an external large corpus to help improve a smaller corpus. in this paper, we aim to build an algorithm framework that makes topic models and vector representations mutually improve each other within the same corpus. an em-style algorithm framework is employed to iteratively optimize both the topic model and the vector representations. experimental results show that our model outperforms state-of-the-art methods on various nlp tasks.",4 "heron inference for bayesian graphical models. bayesian graphical models have been shown to be a powerful tool for discovering uncertainty and causal structure from real-world data in many application fields.
current inference methods primarily follow different kinds of trade-offs between computational complexity and predictive accuracy. at one end of the spectrum, variational inference approaches perform well in computational efficiency; at the other end, gibbs sampling approaches are known to be relatively accurate for prediction in practice. in this paper, we extend an existing gibbs sampling method and propose a new deterministic heron inference (heron) for a family of bayesian graphical models. in addition to the support for nontrivial distributability, a further benefit of heron is that it allows us to easily assess the convergence status and also largely improves the running efficiency. we evaluate heron against the standard collapsed gibbs sampler and a state-of-the-art state augmentation method for inference in well-known graphical models. experimental results using publicly available real-life data have demonstrated that heron significantly outperforms the baseline methods for inferring bayesian graphical models.",4 "an all-in-one convolutional neural network for face analysis. we present a multi-purpose algorithm for simultaneous face detection, face alignment, pose estimation, gender recognition, smile detection, age estimation and face recognition using a single deep convolutional neural network (cnn). the proposed method employs a multi-task learning framework that regularizes the shared parameters of the cnn and builds a synergy among different domains and tasks. extensive experiments show that the network has a better understanding of the face and achieves state-of-the-art results for most of these tasks.",4 "frugal bribery in voting. bribery in elections is an important problem in computational social choice theory. however, bribery with money is often illegal in elections. motivated by this, we introduce the notion of frugal bribery and formulate two new pertinent computational problems, which we call frugal-bribery and frugal-{dollar}bribery, to capture bribery without money in elections. in the proposed model, the frugal nature of the briber is captured by his inability to bribe votes of a certain kind, namely, non-vulnerable votes. in the frugal-bribery problem, the goal is to make a certain candidate win the election by changing only the vulnerable votes.
in the frugal-{dollar}bribery problem, the vulnerable votes have prices and the goal is to make a certain candidate win the election by changing the vulnerable votes, subject to a budget constraint on the briber. we further formulate two natural variants of the frugal-{dollar}bribery problem, namely uniform-frugal-{dollar}bribery and nonuniform-frugal-{dollar}bribery, where the prices of the vulnerable votes are, respectively, all the same or possibly different. we study the computational complexity of the above problems for unweighted and weighted elections under several commonly used voting rules. we observe that, even with a small number of candidates, the problems are intractable for all the voting rules studied here for weighted elections, with the sole exception of the frugal-bribery problem for the plurality voting rule. in contrast, we have polynomial time algorithms for the frugal-bribery problem for the plurality, veto, k-approval, k-veto, and plurality with runoff voting rules for unweighted elections. however, the frugal-{dollar}bribery problem is intractable for all the voting rules studied here, barring the plurality and veto voting rules, for unweighted elections.",4 "a generalised seizure prediction with convolutional neural networks for intracranial and scalp electroencephalogram data analysis. seizure prediction has attracted growing attention as one of the most challenging predictive data analysis efforts, aimed at improving the life of patients living with drug-resistant epilepsy and tonic seizures. many outstanding works have reported great results in providing sensible indirect (warning systems) or direct (interactive neural-stimulation) control of refractory seizures, some achieving high performance. however, many works rely heavily on handcrafted feature extraction and/or carefully tailored feature engineering for each patient to achieve high sensitivity and a low false prediction rate on a particular dataset. this limits the benefit of these approaches when a different dataset is used. in this paper we apply convolutional neural networks (cnns) to different intracranial and scalp electroencephalogram (eeg) datasets and propose a generalized retrospective and patient-specific seizure prediction method. we use the short-time fourier transform (stft) on 30-second eeg windows with 50% overlapping to extract information in both the frequency and time domains.
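the stft-with-50%-overlap preprocessing step described above can be sketched in a few lines; the window length is an illustrative placeholder, not a value taken from the paper:

```python
import numpy as np

# Minimal sketch of the STFT preprocessing step: slide a window over a 1-d
# eeg signal with 50% overlap and take the magnitude of the fft of each
# windowed segment. win_len=256 is an illustrative choice.

def stft_magnitude(x, win_len=256):
    hop = win_len // 2                       # 50% overlap
    w = np.hanning(win_len)
    frames = [x[i:i + win_len] * w
              for i in range(0, len(x) - win_len + 1, hop)]
    # rows: time frames, columns: non-negative frequency bins
    return np.abs(np.fft.rfft(np.asarray(frames), axis=1))
```

the resulting time-frequency magnitude matrix is the kind of 2-d input a cnn can then consume directly.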
a standardization step is then applied to the stft components across the whole frequency range to prevent high-frequency features from being overshadowed by those at lower frequencies. a convolutional neural network model is used for both feature extraction and classification to separate preictal segments from interictal ones. the proposed approach achieves sensitivities of 81.4%, 81.2%, and 82.3% and false prediction rates (fpr) of 0.06/h, 0.16/h, and 0.22/h on the freiburg hospital intracranial eeg (ieeg) dataset, the children's hospital boston-mit scalp eeg (seeg) dataset, and the kaggle american epilepsy society seizure prediction challenge's dataset, respectively. our prediction method is also statistically better than an unspecific random predictor for most patients in all three datasets.",4 "mindx: denoising mixed impulse poisson-gaussian noise using proximal algorithms. we present a novel algorithm for blind denoising of images corrupted by mixed impulse, poisson, and gaussian noises. the algorithm starts by applying the anscombe variance-stabilizing transformation to convert the poisson noise into white gaussian noise. it then applies a combinatorial optimization technique to denoise the mixed impulse and gaussian noise using proximal algorithms. the result is then processed by the inverse anscombe transform. we compare our algorithm to state-of-the-art methods on standard images, and show its superior performance in various noise conditions.",4 "data-dependent kernels in nearly-linear time. we propose a method to efficiently construct data-dependent kernels which can make use of large quantities of (unlabeled) data. our construction makes an approximation to the standard construction of semi-supervised kernels of sindhwani et al. 2005. in typical cases these kernels can be computed in nearly-linear time (in the amount of data), improving on the cubic time of the standard construction, and enabling large scale semi-supervised learning in a variety of contexts. the methods are validated on semi-supervised and unsupervised problems on data sets containing up to 64,000 sample points.",4 "sparse and low-rank approximations of large symmetric matrices using biharmonic interpolation. symmetric matrices are widely used in machine learning problems such as kernel machines and manifold learning.
Using large datasets often requires computing low-rank approximations of these symmetric matrices so that they fit in memory. In this paper, we present a novel method based on biharmonic interpolation for low-rank matrix approximation. The method exploits knowledge of the data manifold to learn an interpolation operator that approximates values using a subset of randomly selected landmark points. This operator is readily sparsified, reducing memory requirements by at least two orders of magnitude without significant loss of accuracy. We show that our method can approximate large datasets using twenty times fewer landmarks than other methods. Further, numerical results suggest that our method is stable even when numerical difficulties arise for other methods.",19 "Empowering the OLAC extension using Anusaaraka and effective text processing using double byte coding. This paper reviews the hurdles faced while trying to implement the OLAC extension for Dravidian / Indian languages. The paper also explores possibilities which could minimise or solve these problems. In this context, the Chinese system of text processing and the Anusaaraka system are scrutinised.",4 "Path-based vs. distributional information in recognizing lexical semantic relations. Recognizing various semantic relations between terms is beneficial for many NLP tasks. While path-based and distributional information sources are considered complementary for this task, the superior results the latter showed recently suggested that the former's contribution might have become obsolete. We follow the recent success of an integrated neural method for hypernymy detection (Shwartz et al., 2016) and extend it to recognize multiple relations. The empirical results show that the method is effective in the multiclass setting as well. We further show that the path-based information source always contributes to the classification, and analyze the cases in which it mostly complements the distributional information.",4 "Fusing continuous-valued medical labels using a Bayesian model. With a rapid increase in the volume of time series medical data available from wearable devices, there is a need to employ automated algorithms to label the data. Examples of labels include interventions, changes in activity (e.g. sleep) and changes in physiology (e.g. arrhythmias). However, automated algorithms tend to be unreliable, resulting in a lower quality of care.
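The label-fusion setting just introduced (several imperfect continuous-valued labellers, one consensus estimate) can be illustrated with simple baselines. This is a hedged sketch of the mean, median, and a one-step inverse-variance weighting, not the Bayesian BCLA model itself; the refinement step is an illustrative assumption.

```python
import numpy as np

def aggregate_labels(votes):
    """votes: (n_algorithms, n_samples) array of continuous labels.
    Returns the mean, the median, and a precision-weighted fusion in
    which each algorithm is weighted by the inverse variance of its
    residual from the plain mean (one fixed-point refinement step)."""
    mean = votes.mean(axis=0)
    median = np.median(votes, axis=0)
    # Estimate each labeller's noise power against the current consensus.
    var = ((votes - mean) ** 2).mean(axis=1) + 1e-12
    w = (1.0 / var) / (1.0 / var).sum()
    weighted = w @ votes
    return mean, median, weighted
```

A full aggregator would also estimate per-labeller bias and iterate the precision estimates to convergence; this sketch shows only the weighting idea.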
Expert annotations are scarce, expensive, and prone to significant inter- and intra-observer variance. To address these problems, a Bayesian continuous-valued label aggregator (BCLA) is proposed to provide a reliable estimation of label aggregation while accurately inferring the precision and bias of each algorithm. The BCLA was applied to QT interval (a pro-arrhythmic indicator) estimation from the electrocardiogram, using labels from the 2006 PhysioNet/Computing in Cardiology Challenge database. It was compared to the mean, the median, and a previously proposed expectation maximization (EM) label aggregation approach. While accurately predicting each labelling algorithm's bias and precision, the root-mean-square error of the BCLA was 11.78$\pm$0.63ms, significantly outperforming the best challenge entry (15.37$\pm$2.13ms) as well as the EM, mean, and median voting strategies (14.76$\pm$0.52ms, 17.61$\pm$0.55ms, and 14.43$\pm$0.57ms respectively, with $p<0.0001$).",4 "The prior matters: simple and general methods for evaluating and improving topic quality in topic modeling. Latent Dirichlet Allocation (LDA) models trained without stopword removal often produce topics with high posterior probabilities on uninformative words, obscuring the underlying corpus content. Even when canonical stopwords are manually removed, uninformative words common in the corpus may still dominate the most probable words in a topic. In this work, we first show how the standard topic quality measures of coherence and pointwise mutual information act counter-intuitively in the presence of common but irrelevant words, making it difficult to even quantitatively identify situations in which topics may be dominated by stopwords. We propose an additional topic quality metric that targets the stopword problem, and show that it, unlike the standard measures, correctly correlates with human judgements of quality. We also propose a simple-to-implement strategy for generating topics that are evaluated to be of much higher quality by human assessment and by the new metric. With this approach, a collection of informative priors easily introduced into most LDA-style inference methods, terms with domain relevance are automatically promoted and domain-specific stop words are demoted.
We demonstrate this approach's effectiveness in three very different domains: Department of Labor accident reports, online health forum posts, and NIPS abstracts. Overall we find that current practices thought to solve this problem do not do so adequately, and that our proposal offers a substantial improvement for those interested in interpreting their topics as objects in their own right.",4 "Stacking-based deep neural network: deep analytic network on convolutional spectral histogram features. A stacking-based deep neural network (S-DNN), in general, denotes a deep neural network (DNN) resemblance in terms of its deep, feedforward network architecture. A typical S-DNN aggregates a variable number of individually learnable modules in series to assemble a DNN-alike alternative for the targeted object recognition tasks. This work likewise devises an S-DNN instantiation, dubbed deep analytic network (DAN), on top of spectral histogram (SH) features. The DAN learning principle relies on ridge regression and some key DNN constituents, specifically, the rectified linear unit, fine-tuning, and normalization. The DAN aptitude is scrutinized on three repositories of varying domains, including FERET (faces), MNIST (handwritten digits), and CIFAR10 (natural objects). The empirical results unveil that DAN escalates the SH baseline performance over a sufficiently deep layer.",4 Long-term evolution of genetic programming populations. We evolve binary mux-6 trees for up to 100000 generations, evolving programs with more than a hundred million nodes. Our unbounded long-term evolution experiment LTEE GP appears not to evolve building blocks but does suggest a limit to bloat. We do see periods of tens or even hundreds of generations where the population is 100 percent functionally converged. The distribution of tree sizes is predicted by theory.,4 "Distributed adaptive LMF algorithm for sparse parameter estimation in Gaussian mixture noise. A distributed adaptive algorithm for the estimation of sparse unknown parameters in the presence of non-Gaussian noise is proposed in this paper, based on the normalized least mean fourth (NLMF) criterion. In the first step, the local adaptive NLMF algorithm is modified with the zero norm in order to speed up the convergence rate and also to reduce the steady state error power in sparse conditions.
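A single zero-attracting least-mean-fourth update, as used in the NLMF entry above, can be sketched as follows. This is a hedged sketch: the exact normalization term and the zero-norm approximation used in the paper may differ; here a simple input-power normalization and a sign-based shrinkage stand in for them.

```python
import numpy as np

def za_nlmf_step(w, x, d, mu=0.01, rho=1e-4, eps=1e-8):
    """One zero-attracting normalized least-mean-fourth update:
    a cubic-error gradient step, normalized by input power, plus a
    sign(w) shrinkage term that attracts small taps toward zero
    (promoting sparsity)."""
    e = d - x @ w                                  # a priori error
    w = w + mu * (e ** 3) * x / (eps + (x @ x) ** 2)
    w = w - rho * np.sign(w)                       # zero attractor
    return w
```

In the distributed version described in the abstract, each node would run such a step locally and then combine its estimate with those of its neighbours.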
Then, the proposed algorithm is extended to the distributed scenario, in which an improvement in estimation performance is achieved due to the cooperation of the local adaptive filters. Simulation results show the superiority of the proposed algorithm in comparison with conventional NLMF algorithms.",4 "Explorative data analysis for changes in neural activity. Neural recordings are nonstationary time series, i.e. their properties typically change over time. Identifying specific changes, e.g. those induced by a learning task, can shed light on the underlying neural processes. However, such changes of interest are often masked by strong unrelated changes, which can be of physiological origin or due to measurement artifacts. We propose a novel algorithm for disentangling such different causes of non-stationarity in a manner that enables better neurophysiological interpretation for a wider set of experimental paradigms. A key ingredient is the repeated application of stationary subspace analysis (SSA) using different temporal scales. The usefulness of our explorative approach is demonstrated in simulations, theory and EEG experiments with 80 brain-computer-interfacing (BCI) subjects.",16 "Stepwise regression for unsupervised learning. We consider unsupervised extensions of the fast stepwise linear regression algorithm \cite{efroymson1960multiple}. These extensions allow one to efficiently identify highly-representative feature variable subsets within a given set of jointly distributed variables. This in turn allows for the efficient dimensional reduction of large data sets via the removal of redundant features. Fast search is effected here through the avoidance of repeat computations across trial fits, allowing a full representative-importance ranking of a set of feature variables to be carried out in $O(n^2 m)$ time, where $n$ is the number of variables and $m$ is the number of data samples available. This runtime complexity matches that needed to carry out a single regression and is $O(n^2)$ faster than naive implementations. We present pseudocode suitable for efficient forward, reverse, and forward-reverse unsupervised feature selection. To illustrate the algorithm's application, we apply it to the problem of identifying representative stocks within a given financial market index -- a challenge relevant to the design of exchange traded funds (ETFs).
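The forward variant of unsupervised feature selection described above can be sketched naively: at each step, add the feature whose inclusion best explains the variance of all features via least squares. This is a hedged illustration only; it repeats work across trial fits, whereas the paper's algorithm reuses computations to reach $O(n^2 m)$ time.

```python
import numpy as np

def forward_select(X, k):
    """Greedy unsupervised forward selection (naive version).
    X: (m, n) data matrix; returns indices of k representative columns,
    chosen to minimize the residual of regressing every column on the
    selected subset."""
    n = X.shape[1]
    chosen = []
    for _ in range(k):
        best, best_err = None, np.inf
        for j in range(n):
            if j in chosen:
                continue
            S = X[:, chosen + [j]]
            # Residual sum of squares of regressing all of X on S.
            coef, *_ = np.linalg.lstsq(S, X, rcond=None)
            err = ((X - S @ coef) ** 2).sum()
            if err < best_err:
                best, best_err = j, err
        chosen.append(best)
    return chosen
```

On the ETF-style use case above, X's columns would be stock return series and the selected columns the representative stocks.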
We also characterize the growth of numerical error with iteration step in these algorithms, and finally demonstrate and rationalize the observation that the forward and reverse algorithms return exactly inverted feature orderings in the weakly-correlated feature set regime.",4 "MissForest - nonparametric missing value imputation for mixed-type data. Modern data acquisition based on high-throughput technology often faces the problem of missing data. Algorithms commonly used in the analysis of such large-scale data often depend on a complete set. Missing value imputation offers a solution to this problem. However, the majority of available imputation methods are restricted to one type of variable only: continuous or categorical. For mixed-type data the different types are usually handled separately. Therefore, these methods ignore possible relations between variable types. We propose a nonparametric method which can cope with different types of variables simultaneously. We compare several state-of-the-art methods for the imputation of missing values. We propose and evaluate an iterative imputation method (missForest) based on a random forest. By averaging over many unpruned classification or regression trees, a random forest intrinsically constitutes a multiple imputation scheme. Using the built-in out-of-bag error estimates of random forest, we are able to estimate the imputation error without the need of a test set. Evaluation is performed on multiple data sets coming from a diverse selection of biological fields with artificially introduced missing values ranging from 10% to 30%. We show that missForest can successfully handle missing values, particularly in data sets including different types of variables. In our comparative study missForest outperforms other methods of imputation, especially in data settings where complex interactions and nonlinear relations are suspected. The out-of-bag imputation error estimates of missForest prove to be adequate in all settings. Additionally, missForest exhibits attractive computational efficiency and can cope with high-dimensional data.",19 "Ensemble classifier approach in breast cancer detection and malignancy grading - a review. The diagnosed cases of breast cancer are increasing annually, and unfortunately are getting converted into a high mortality rate.
Cancer, in its early stages, is hard to detect because the malicious cells show properties (density) similar to those shown by non-malicious cells. The mortality ratio could be minimized if breast cancer could be detected in its early stages. Current systems are not able to achieve a fully automatic system capable of detecting breast cancer and also of detecting its stage. The estimation of malignancy grading is important in diagnosing the degree of growth of malicious cells as well as in selecting a proper therapy for the patient. Therefore, a complete and efficient clinical decision support system is proposed which is capable of achieving a breast cancer malignancy grading scheme efficiently. The system is based on the image processing and machine learning domains. The classification imbalance problem, a machine learning problem, occurs when the instances of one class greatly outnumber the instances of the other class, resulting in inefficient classification of samples and hence a bad decision support system. Therefore EUSBoost, an ensemble-based classifier, is proposed, which is efficient and able to outperform other classifiers as it takes the benefits of both the boosting algorithm and random undersampling techniques. Also a comparison of EUSBoost with other techniques is shown in the paper.",4 "Supervised feature selection for diagnosis of coronary artery disease based on genetic algorithm. Feature selection (FS) has become the focus of much research in decision support systems areas for which data sets with a tremendous number of variables are analyzed. In this paper we present a new method for the diagnosis of coronary artery diseases (CAD) founded on a genetic algorithm (GA) wrapped with naive Bayes (BN) based FS. Basically, the CAD dataset contains two classes defined by 13 features. With the GA and BN algorithm, the GA generates at each iteration a subset of attributes which is evaluated using the BN in the second step of the selection procedure. The final set of attributes contains the most relevant features for the model, which increases the accuracy. The algorithm in this case produces 85.50% classification accuracy in the diagnosis of CAD. The merit of the algorithm is thus compared with the use of support vector machine (SVM), multilayer perceptron (MLP) and the C4.5 decision tree algorithm. The classification accuracy results for those algorithms are respectively 83.5%, 83.16% and 80.85%. Consequently, the GA wrapped with BN algorithm is correspondingly compared with other FS algorithms.
The obtained results have shown very promising outcomes for the diagnosis of CAD.",4 "Dynamic pricing with demand covariates. We consider a firm that sells products over $t$ periods without knowing the demand function. The firm sequentially sets prices to earn revenue and to learn the underlying demand function simultaneously. A natural heuristic for this problem, commonly used in practice, is greedy iterative least squares (GILS). At each time period, GILS estimates the demand as a linear function of the price by applying least squares to the set of prior prices and realized demands. The price that maximizes the revenue, given the estimated demand function, is then used for the next time period. Performance is measured by the regret, the expected revenue loss relative to the optimal (oracle) pricing policy when the demand function is known. Recently, den Boer and Zwart (2014) and Keskin and Zeevi (2014) demonstrated that GILS is sub-optimal. They introduced algorithms which integrate forced price dispersion with GILS and achieve asymptotically optimal performance. In this paper, we consider the dynamic pricing problem in a data-rich environment. In particular, we assume that the firm knows the expected demand under a particular price from historical data, and that in each period, before setting the price, the firm has access to extra information (demand covariates) which may be predictive of the demand. We prove that in this setting GILS achieves asymptotically optimal regret of order $\log(t)$. We also show the following surprising result: in the original dynamic pricing problem of den Boer and Zwart (2014) and Keskin and Zeevi (2014), the inclusion in GILS of a set of covariates as potential demand covariates (even though they could carry no information) would make GILS asymptotically optimal. We validate our results via extensive numerical simulations on synthetic and real data sets.",19 "Interactive restless multi-armed bandit game and swarm intelligence effect. We obtain the conditions for the emergence of the swarm intelligence effect in an interactive game of restless multi-armed bandit (RMAB). A player competes with multiple agents. Each bandit has a payoff that changes with probability $p_{c}$ per round. The agents and the player choose one of three options: (1) exploit (a good bandit), (2) innovate (asocial learning for a good bandit among $n_{i}$ randomly chosen bandits), or (3) observe (social learning for a good bandit).
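The GILS heuristic from the dynamic-pricing entry above is easy to sketch: fit demand as a linear function of price by ordinary least squares, then charge the revenue-maximizing price under that fit. This is a hedged illustration under assumed price bounds, not the papers' full algorithm (which adds forced price dispersion or covariates).

```python
import numpy as np

def gils_price(prices, demands, p_min=0.1, p_max=10.0):
    """One step of greedy iterative least squares: fit demand = a + b*p
    by OLS on the history, then return the price maximizing estimated
    revenue p*(a + b*p), i.e. -a/(2b), clipped to [p_min, p_max]."""
    X = np.column_stack([np.ones(len(prices)), prices])
    a, b = np.linalg.lstsq(X, np.asarray(demands, dtype=float), rcond=None)[0]
    if b >= 0:                 # non-downward-sloping fit: fall back to a bound
        return p_max
    return float(np.clip(-a / (2 * b), p_min, p_max))
```

Iterating this rule (price, observe demand, refit) is exactly the greedy loop whose regret the entry analyzes.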
Each agent has two parameters $(c,p_{obs})$ that specify its decision: (i) $c$, the threshold value for exploit, and (ii) $p_{obs}$, the probability of observe in learning. The parameters $(c,p_{obs})$ are uniformly distributed. We determine the optimal strategies for the player using complete knowledge of the RMAB. We show whether social or asocial learning is optimal in the $(p_{c},n_{i})$ space and define the swarm intelligence effect. We conduct a laboratory experiment (67 subjects) and observe the swarm intelligence effect only if $(p_{c},n_{i})$ are chosen so that social learning is far more optimal than asocial learning.",4 "A learning-based approach for automatic image and video colorization. In this paper, we present a color transfer algorithm to colorize a broad range of gray images without any user intervention. The algorithm uses a machine learning-based approach to automatically colorize grayscale images. It uses the superpixel representation of reference color images to learn the relationship between different image features and their corresponding color values. We use this learned information to predict the color value of each grayscale image superpixel. Compared to processing individual image pixels, our use of superpixels helps us achieve a much higher degree of spatial consistency as well as speeding up the colorization process. The predicted color values of the gray-scale image superpixels are used to provide a 'micro-scribble' at the centroid of each superpixel. These color scribbles are refined using a voting based approach. To generate the final colorization result, we use an optimization-based approach to smoothly spread the color scribble across all pixels within a superpixel. Experimental results on a broad range of images, and a comparison with existing state-of-the-art colorization methods, demonstrate the greater effectiveness of the proposed algorithm.",4 "Instance-level salient object segmentation. Image saliency detection has recently witnessed rapid progress due to deep convolutional neural networks. However, none of the existing methods is able to identify object instances in the detected salient regions. In this paper, we present a salient instance segmentation method that produces a saliency mask with distinct object instance labels for an input image.
Our method consists of three steps: estimating a saliency map, detecting salient object contours, and identifying salient object instances. For the first two steps, we propose a multiscale saliency refinement network, which generates high-quality salient region masks and salient object contours. Once integrated with multiscale combinatorial grouping and a MAP-based subset optimization framework, our method can generate very promising salient object instance segmentation results. To promote further research and evaluation of salient instance segmentation, we also construct a new database of 1000 images with pixelwise salient instance annotations. Experimental results demonstrate that our proposed method is capable of achieving state-of-the-art performance on all public benchmarks for salient region detection, as well as on our new dataset for salient instance segmentation.",4 "Associative memories based on multiple-valued sparse clustered networks. Associative memories are structures that store data patterns and retrieve them given partial inputs. Sparse clustered networks (SCNs) are recently-introduced binary-weighted associative memories that significantly improve the storage and retrieval capabilities over the prior state-of-the-art. However, deleting or updating the data patterns results in a significant increase in the data retrieval error probability. In this paper, we propose an algorithm to address this problem by incorporating multiple-valued weights for the interconnections used in the network. The proposed algorithm lowers the error rate by an order of magnitude for our sample network with 60% of the contents deleted. We then investigate the advantages of the proposed algorithm for hardware implementations.",4 "On the robustness of semantic segmentation models to adversarial attacks. Deep neural networks (DNNs) have been demonstrated to perform exceptionally well on most recognition tasks such as image classification and segmentation. However, they have also been shown to be vulnerable to adversarial examples. This phenomenon has recently attracted a lot of attention, but it has not been extensively studied on multiple, large-scale datasets and complex tasks such as semantic segmentation, which often require specialised networks with additional components such as CRFs, dilated convolutions, skip-connections and multiscale processing.
In this paper, we present what to our knowledge is the first rigorous evaluation of adversarial attacks on modern semantic segmentation models, using two large-scale datasets. We analyse the effect of different network architectures, model capacity and multiscale processing, and show that many observations made on the task of classification do not always transfer to this more complex task. Furthermore, we show how mean-field inference in deep structured models and multiscale processing naturally implement recently proposed adversarial defenses. Our observations will aid future efforts in understanding and defending against adversarial examples. Moreover, in the shorter term, we show which segmentation models should currently be preferred in safety-critical applications due to their inherent robustness.",4 "POMDP-lite for robust robot planning under uncertainty. The partially observable Markov decision process (POMDP) provides a principled general model for planning under uncertainty. However, solving a general POMDP is computationally intractable in the worst case. This paper introduces POMDP-lite, a subclass of POMDPs in which the hidden state variables are constant or change deterministically. We show that POMDP-lite is equivalent to a set of fully observable Markov decision processes indexed by a hidden parameter, and is useful for modeling a variety of interesting robotic tasks. We develop a simple model-based Bayesian reinforcement learning algorithm to solve POMDP-lite models. The algorithm performs well on large-scale POMDP-lite models with $10^{20}$ states and outperforms state-of-the-art general-purpose POMDP algorithms. We further show that the algorithm is near-Bayesian-optimal under suitable conditions.",4 "Oracle MCG: a first peek into the COCO detection challenges. The recently presented COCO detection challenge will most probably be the reference benchmark in object detection in the coming years. COCO is two orders of magnitude larger than Pascal and has four times the number of categories; so in all likelihood researchers will be faced with a number of new challenges. At this point, without any finished round of the competition, it is difficult for researchers to put their techniques in context, in other words, to know how good their results are. In order to give a little context, this note evaluates a hypothetical object detector consisting of an oracle picking the best object proposal from a state-of-the-art technique.
This oracle achieves AP=0.292 for segmented objects and AP=0.317 for bounding boxes, showing that the database is indeed challenging, given that this value is the best one can expect when working with object proposals without refinement.",4 "Short term load forecasting models in the Czech Republic using soft computing paradigms. This paper presents a comparative study of six soft computing models, namely multilayer perceptron networks, Elman recurrent neural network, radial basis function network, Hopfield model, fuzzy inference system and hybrid fuzzy neural network, for hourly electricity demand forecasting in the Czech Republic. The soft computing models were trained and tested using actual hourly load data for seven years. A comparison of the proposed techniques is presented for predicting 2 day ahead demands for electricity. Simulation results indicate that the hybrid fuzzy neural network and radial basis function networks are the best candidates for the analysis and forecasting of electricity demand.",4 "Differentiable submodular maximization. We consider learning of submodular functions from data. These functions are important in machine learning and have a wide range of applications, e.g. data summarization, feature selection and active learning. Despite their combinatorial nature, submodular functions can be maximized approximately with strong theoretical guarantees in polynomial time. Typically, learning the submodular function and optimization of that function are treated separately, i.e. the function is first learned using a proxy objective and subsequently maximized. In contrast, we show how to perform learning and optimization jointly. By interpreting the output of greedy maximization algorithms as distributions over sequences of items and smoothening these distributions, we obtain a differentiable objective. In this way, we can differentiate through the maximization algorithms and optimize the model to work well with the optimization algorithm. We theoretically characterize the error made by our approach, yielding insights into the trade-off of smoothness and accuracy. We demonstrate the effectiveness of our approach for jointly learning and optimizing on synthetic maxcut data, and on a real world product recommendation application.",19 "Shape estimation from defocus cue for microscopy images via belief propagation.
In recent years, the usefulness of 3D shape estimation has been realized in microscopic or close-range imaging, as the 3D information can be used in various applications. Due to the limited depth of field at small distances, the defocus blur induced in images can provide information about the 3D shape of the object. The task of `shape from defocus' (SFD) involves the problem of estimating good quality 3D shape from images with depth-dependent defocus blur. While the research area of SFD is quite well-established, existing approaches have largely demonstrated results on objects with bulk/coarse shape variation. However, in many cases, objects studied under microscopes involve fine/detailed structures, which have not been explicitly considered in most methods. In addition, given that in recent years large data volumes are typically associated with microscopy related applications, it is also important for such SFD methods to be efficient. In this work, we provide an indication of the usefulness of the belief propagation (BP) approach in addressing these concerns for SFD. BP is known to be an efficient combinatorial optimization approach, and has been empirically demonstrated to yield good quality solutions in low-level vision problems such as image restoration, stereo disparity estimation, etc. To exploit the efficiency of BP in SFD, we assume local space-invariance of the defocus blur, which enables the application of BP in a straightforward manner. Even with this assumption, the ability of BP to provide good quality solutions using non-convex priors is reflected in the plausible shape estimates it yields in the presence of fine structures on objects under microscopy imaging.",4 "Polyploidy and the discontinuous heredity effect on evolutionary multi-objective optimization. This paper examines the effect of mimicking the discontinuous heredity caused by carrying more than one chromosome in some living organisms' cells on evolutionary multi-objective optimization algorithms. With this representation, the phenotype may not fully reflect the genotype. By mimicking living organisms' inheritance mechanism, traits may be silently carried for many generations, only to reappear later. Representations with different numbers of chromosomes in each solution vector are tested on different benchmark problems with high numbers of decision variables and objectives.
A comparison with the non-dominated sorting genetic algorithm-II is done on all problems.",4 "Preliminary report on the structure of Croatian linguistic co-occurrence networks. In this article, we investigate the structure of Croatian linguistic co-occurrence networks. We examine the change of network structure properties by systematically varying the co-occurrence window sizes and the corpus sizes, and by removing stopwords. In a co-occurrence window of size $n$ we establish a link between the current word and the $n-1$ subsequent words. The results point out that an increase of the co-occurrence window size is followed by a decrease in the diameter, a shortening of the average path and, expectedly, a condensing of the average clustering coefficient. The same can be noticed upon the removal of stopwords. Finally, since the size of the texts is reflected in the network properties, the results suggest that the corpus influence can be reduced by increasing the co-occurrence window size.",4 "Boosting in the presence of label noise. Boosting is known to be sensitive to label noise. We studied two approaches to improve AdaBoost's robustness against labelling errors. One is to employ a label-noise robust classifier as a base learner, while the other is to modify the AdaBoost algorithm to be more robust. Empirical evaluation shows that a committee of robust classifiers, although it converges faster than non label-noise aware AdaBoost, is still susceptible to label noise. However, pairing it with the new robust boosting algorithm we propose here results in a more resilient algorithm under mislabelling.",4 "Nonparametric nearest neighbor descent clustering based on Delaunay triangulation. In our physically inspired in-tree (IT) based clustering algorithm and the series of IT methods, there is only one free parameter involved in computing the potential value of each point. In this work, based on the Delaunay triangulation or its dual Voronoi tessellation, we propose a nonparametric process to compute potential values from local information. This computation, though nonparametric, is relatively rough, and consequently, many local extreme points will be generated. However, unlike gradient-based methods, IT-based methods are generally insensitive to local extremes. This positively demonstrates the superiority of these IT-based methods, both parametric (previous) and nonparametric (in this work).",19 "Universal intelligence: a definition of machine intelligence.
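The co-occurrence window construction from the Croatian network entry above (a link from the current word to each of the $n-1$ subsequent words) can be sketched directly. This is a minimal illustration under the assumption that edges are directed and weighted by count.

```python
from collections import Counter

def cooccurrence_edges(tokens, window=2):
    """Build weighted directed edges of a co-occurrence network:
    each word links to the window-1 words that follow it."""
    edges = Counter()
    for i, w in enumerate(tokens):
        for v in tokens[i + 1:i + window]:
            edges[(w, v)] += 1
    return edges
```

Growing `window` adds longer-range links, which is what drives the diameter decrease and clustering increase reported in the entry.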
A fundamental problem in artificial intelligence is that nobody really knows what intelligence is. The problem is especially acute when we need to consider artificial systems which are significantly different to humans. In this paper we approach this problem in the following way: we take a number of well known informal definitions of human intelligence that have been given by experts, and extract their essential features. These are then mathematically formalised to produce a general measure of intelligence for arbitrary machines. We believe that this equation formally captures the concept of machine intelligence in the broadest reasonable sense. We then show how this formal definition is related to the theory of universal optimal learning agents. Finally, we survey the many other tests of, and definitions of, intelligence that have been proposed for machines.",4 "Optimizing human-interpretable dialog management policy using genetic algorithm. Automatic optimization of spoken dialog management policies that are robust to environmental noise has long been the goal of both academia and industry. Approaches based on reinforcement learning have proved effective. However, the numerical representation of a dialog policy is human-incomprehensible and difficult for dialog system designers to verify or modify, which limits its practical application. In this paper we propose a novel framework for optimizing dialog policies specified in a domain language using a genetic algorithm. The human-interpretable representation of the policy makes the method suitable for practical employment. We present learning algorithms using user simulation and real human-machine dialogs, respectively. Empirical experimental results are given to show the effectiveness of the proposed approach.",4 "Cuckoo search: a brief literature review. Since cuckoo search (CS) was introduced in 2009, it has attracted great attention due to its promising efficiency in solving many optimization problems and real-world applications. In the last few years, many papers have been published regarding cuckoo search, and the relevant literature has expanded significantly. This chapter summarizes briefly the majority of the literature on cuckoo search in peer-reviewed journals and conferences found so far. These references are systematically classified into appropriate categories, which can be used as a basis for further research.",12 "Learning neural markers of schizophrenia disorder using recurrent neural networks.
Smart systems that can accurately diagnose patients with mental disorders and identify effective treatments based on brain functional imaging data are of great applicability and are gaining much attention. Most previous machine learning studies use hand-designed features, such as functional connectivity, which do not maintain the potentially useful information in the spatial relationship between brain regions and the temporal profile of the signal in each region. Here we propose a new method based on recurrent-convolutional neural networks to automatically learn useful representations from segments of 4-D fMRI recordings. Our goal is to exploit both spatial and temporal information in the functional MRI movie (at the whole-brain voxel level) for identifying patients with schizophrenia.",4 "Detection and tracking of general movable objects in large 3D maps. This paper studies the problem of detection and tracking of general objects with long-term dynamics, observed by a mobile robot moving in a large environment. A key problem is that, due to the environment scale, the robot can only observe a subset of the objects at any given time. Since some time passes between observations of objects in different places, the objects might be moved when the robot is not there. We propose a model for this movement in which the objects typically move locally, but with a small probability jump longer distances, through what we call global motion. For filtering, we decompose the posterior over local and global movements into two linked processes. The posterior over the global movements and measurement associations is sampled, while we track the local movement analytically using Kalman filters. This novel filter is evaluated on point cloud data gathered autonomously by a mobile robot over an extended period of time. We show that tracking jumping objects is feasible, and that the proposed probabilistic treatment outperforms previous methods when applied to real world data. The key to efficient probabilistic tracking in this scenario is focused sampling of the object posteriors.",4 "Sparse bilinear logistic regression. In this paper, we introduce the concept of sparse bilinear logistic regression for decision problems involving explanatory variables that are two-dimensional matrices. Such problems are common in computer vision, brain-computer interfaces, style/content factorization, and parallel factor analysis.
The underlying optimization problem is bi-convex; we study its solution and develop an efficient algorithm based on block coordinate descent. We provide a theoretical guarantee of global convergence and estimate the asymptotical convergence rate using the Kurdyka-{\l}ojasiewicz inequality. A range of experiments with simulated and real data demonstrate that sparse bilinear logistic regression outperforms current techniques in several important applications.",12 "Training a fully convolutional neural network to route integrated circuits. We present a deep, fully convolutional neural network that learns to route a circuit layout net with an appropriate choice of metal tracks and wire class combinations. Inputs to the network are encoded layouts containing the spatial locations of the pins to be routed. After 15 fully convolutional stages followed by a score comparator, the network outputs 8 layout layers (corresponding to 4 route layers, 3 via layers and an identity-mapped pin layer), which are then decoded to obtain the routed layouts. We formulate this as a binary segmentation problem on a per-pixel per-layer basis, where the network is trained to correctly classify pixels in each layout layer as 'on' or 'off'. To demonstrate the learnability of layout design rules, we train the network on a dataset of 50,000 train and 10,000 validation samples that we generate based on certain pre-defined layout constraints. Precision, recall and $f_1$ score metrics are used to track the training progress. Our network achieves $f_1\approx97\%$ on the train set and $f_1\approx92\%$ on the validation set. We use PyTorch for implementing our model. Code is made publicly available at https://github.com/sjain-stanford/deep-route .",4 "Analysis of a design pattern for teaching with features and labels. We study the task of teaching a machine to classify objects using features and labels. We introduce the error-driven-featuring design pattern for teaching using features and labels, in which a teacher prefers to introduce features only as needed. We analyze the potential risks and benefits of this teaching pattern through the use of teaching protocols, illustrative examples, and by providing bounds on the effort required for an optimal machine teacher using a linear learning algorithm, the most commonly used type of learner in interactive machine learning systems.
Our analysis provides a deeper understanding of the potential trade-offs of using different learning algorithms, and of the effort required for featuring (creating new features) versus labeling (providing labels for objects).",4 "Multifractal analysis of sentence lengths in English literary texts. This paper presents an analysis of 30 literary texts written in English by different authors. For each text, we created a time series representing the length of its sentences in words and analyzed its fractal properties using two methods of multifractal analysis: MFDFA and WTMM. Both methods showed that some of the texts considered are multifractal in this representation, while the majority of texts are not multifractal or even fractal at all. Among the 30 books, only those with strongly correlated lengths of consecutive sentences can be analyzed as signals interpreted as real multifractals. An interesting direction for future investigation would be identifying the specific features which cause certain texts to be multifractal and others monofractal or not even fractal at all.",15 "An empirical evaluation of four tensor decomposition algorithms. Higher-order tensor decompositions are analogous to the familiar singular value decomposition (SVD), but they transcend the limitations of matrices (second-order tensors). SVD is a powerful tool that has achieved impressive results in information retrieval, collaborative filtering, computational linguistics, computational vision, and other fields. However, SVD is limited to two-dimensional arrays of data (two modes), and many potential applications have three or more modes, which require higher-order tensor decompositions. This paper evaluates four algorithms for higher-order tensor decomposition: higher-order singular value decomposition (HO-SVD), higher-order orthogonal iteration (HOOI), slice projection (SP), and multislice projection (MP). We measure the time (elapsed run time), space (RAM and disk space requirements), and fit (tensor reconstruction accuracy) of the four algorithms, under a variety of conditions. We find that standard implementations of HO-SVD and HOOI do not scale up to larger tensors, due to increasing RAM requirements. We recommend HOOI for tensors that are small enough for the available RAM, and MP for larger tensors.",4 "Classifying and visualizing motion capture sequences using deep neural networks.
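The HO-SVD evaluated in the tensor-decomposition entry above can be sketched compactly with numpy. This is a hedged sketch of truncated HO-SVD only (mode-wise SVDs of the unfoldings, then projection onto the factors); HOOI, SP and MP from the entry are not shown.

```python
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding: move `mode` to the front and flatten the rest."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def hosvd(T, ranks):
    """Truncated higher-order SVD: each factor U_n holds the leading
    left singular vectors of the mode-n unfolding; the core is T
    projected onto the factors."""
    Us = [np.linalg.svd(unfold(T, n), full_matrices=False)[0][:, :r]
          for n, r in enumerate(ranks)]
    core = T
    for n, U in enumerate(Us):
        # Contract mode n of the core with U.T, keeping axes in place.
        core = np.moveaxis(np.tensordot(U.T, np.moveaxis(core, n, 0), axes=1), 0, n)
    return core, Us
```

Reconstructing with the factors (multiplying the core by each `U_n`) recovers the tensor exactly at full ranks; smaller ranks trade reconstruction accuracy for the time and space savings the entry measures.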
gesture recognition using motion capture data depth sensors recently drawn attention vision recognition. currently systems classify dataset couple dozens different actions. moreover, feature extraction data often computational complex. paper, propose novel system recognize actions skeleton data simple, effective, features using deep neural networks. features extracted frame based relative positions joints (po), temporal differences (td), normalized trajectories motion (nt). given features hybrid multi-layer perceptron trained, simultaneously classifies reconstructs input data. use deep autoencoder visualize learnt features, experiments show deep neural networks capture discriminative information than, instance, principal component analysis can. test system public database 65 classes 2,000 motion sequences. obtain accuracy 95% is, knowledge, state art result large dataset.",4 "interpretable classifiers using rules bayesian analysis: building better stroke prediction model. aim produce predictive models accurate, also interpretable human experts. models decision lists, consist series if...then... statements (e.g., high blood pressure, stroke) discretize high-dimensional, multivariate feature space series simple, readily interpretable decision statements. introduce generative model called bayesian rule lists yields posterior distribution possible decision lists. employs novel prior structure encourage sparsity. experiments show bayesian rule lists predictive accuracy par current top algorithms prediction machine learning. method motivated recent developments personalized medicine, used produce highly accurate interpretable medical scoring systems. demonstrate producing alternative chads$_2$ score, actively used clinical practice estimating risk stroke patients atrial fibrillation. model interpretable chads$_2$, accurate.",19 "reasoning uncertainty: monte carlo results. series monte carlo studies performed compare behavior alternative procedures reasoning uncertainty. 
behavior several bayesian, linear model default reasoning procedures examined context increasing levels calibration error. interesting result bayesian procedures tended output extreme posterior belief values (posterior beliefs near 0.0 1.0) techniques, linear models relatively less likely output strong support erroneous conclusion. also, accounting probabilistic dependencies evidence items important bayesian linear updating procedures.",4 "outer-product hidden markov model polyphonic midi score following. present polyphonic midi score-following algorithm capable following performances arbitrary repeats skips, based probabilistic model musical performances. attractive practical applications score following handle repeats skips may made arbitrarily performances, algorithms previously described literature cannot applied scores practical length due problems large computational complexity. propose new type hidden markov model (hmm) performance model describe arbitrary repeats skips including performer tendencies distributed score positions them, derive efficient score-following algorithm reduces computational complexity without pruning. theoretical discussion much information performer tendencies improves score-following results given. proposed score-following algorithm also admits performance mistakes demonstrated effective practical situations carrying evaluations human performances. proposed hmm potentially valuable topics information processing also provide detailed description inference algorithms.",4 "cognitive architecture direction attention founded subliminal memory searches, pseudorandom nonstop. way explaining brain works logically, human associative memory modeled logical memory neurons, corresponding standard digital circuits. resulting cognitive architecture incorporates basic psychological elements short term long term memory. novel architecture memory searches using cues chosen pseudorandomly short term memory. 
recalls alternated sensory images, many tens per second, analyzed subliminally ongoing process, determine direction attention short term memory.",4 "3d shape estimation 2d landmarks: convex relaxation approach. investigate problem estimating 3d shape object, given set 2d landmarks single image. alleviate reconstruction ambiguity, widely-used approach confine unknown 3d shape within shape space built upon existing shapes. approach proven successful various applications, challenging issue remains, i.e., joint estimation shape parameters camera-pose parameters requires solve nonconvex optimization problem. existing methods often adopt alternating minimization scheme locally update parameters, consequently solution sensitive initialization. paper, propose convex formulation address problem develop efficient algorithm solve proposed convex program. demonstrate exact recovery property proposed method, merits compared alternative methods, applicability human pose car shape estimation.",4 "detecting overlapping temporal community structure time-evolving networks. present principled approach detecting overlapping temporal community structure dynamic networks. method based following framework: find overlapping temporal community structure maximizes quality function associated snapshot network subject temporal smoothness constraint. novel quality function smoothness constraint proposed handle overlaps, new convex relaxation used solve resulting combinatorial optimization problem. provide theoretical guarantees well experimental results reveal community structure real synthetic networks. main insight certain structures identified temporal correlation considered communities allowed overlap. general, discovering overlapping temporal community structure enhance understanding real-world complex networks revealing underlying stability behind seemingly chaotic evolution.",4 "anmm: ranking short answer texts attention-based neural matching model. 
alternative question answering methods based feature engineering, deep learning approaches convolutional neural networks (cnns) long short-term memory models (lstms) recently proposed semantic matching questions answers. achieve good results, however, models combined additional features word overlap bm25 scores. without combination, models perform significantly worse methods based linguistic feature engineering. paper, propose attention based neural matching model ranking short answer text. adopt value-shared weighting scheme instead position-shared weighting scheme combining different matching signals incorporate question term importance learning using question attention network. using popular benchmark trec qa data, show relatively simple anmm model significantly outperform neural network models used question answering task, competitive models combined additional features. anmm combined additional features, outperforms baselines.",4 "learning hierarchical latent-variable model 3d shapes. propose variational shape learner (vsl), hierarchical latent-variable model 3d shape learning. vsl employs unsupervised approach learning inferring underlying structure voxelized 3d shapes. use skip-connections, model successfully learn latent, hierarchical representation objects. furthermore, realistic 3d objects easily generated sampling vsl's latent probabilistic manifold. show generative model trained end-to-end 2d images perform single image 3d model retrieval. experiments show, quantitatively qualitatively, improved performance proposed model range tasks.",4 "meta-qsar: large-scale application meta-learning drug design discovery. investigate learning quantitative structure activity relationships (qsars) case-study meta-learning. application area highest societal importance, key step development new medicines. standard qsar learning problem is: given target (usually protein) set chemical compounds (small molecules) associated bioactivities (e.g. 
inhibition target), learn predictive mapping molecular representation activity. although almost every type machine learning method applied qsar learning agreed single best way learning qsars, therefore problem area well-suited meta-learning. first carried comprehensive ever comparison machine learning methods qsar learning: 18 regression methods, 6 molecular representations, applied 2,700 qsar problems. (these results made publicly available openml represent valuable resource testing novel meta-learning methods.) investigated utility algorithm selection qsar problems. found meta-learning approach outperformed best individual qsar learning method (random forests using molecular fingerprint representation) 13%, average. conclude meta-learning outperforms base-learning methods qsar learning, investigation one extensive ever comparisons base meta-learning methods ever made, provides evidence general effectiveness meta-learning base-learning.",4 "new learning paradigm random vector functional-link network: rvfl+. school, teacher plays important role various classroom teaching patterns. likewise human learning activity, learning using privileged information (lupi) paradigm provides additional information generated teacher 'teach' learning algorithms training stage. therefore, novel learning paradigm typical teacher-student interaction mechanism. paper first present random vector functional link network based lupi paradigm, called rvfl+. rather simply combining two existing approaches, newly-derived rvfl+ fills gap neural networks lupi paradigm, offers alternative way train rvfl networks. moreover, proposed rvfl+ perform conjunction kernel trick highly complicated nonlinear feature learning, termed krvfl+. furthermore, statistical property proposed rvfl+ investigated, derive sharp high-quality generalization error bound based rademacher complexity. 
competitive experimental results 14 real-world datasets illustrate great effectiveness efficiency novel rvfl+ krvfl+, achieve better generalization performance state-of-the-art algorithms.",19 "behavior path planning coalition cognitive robots smart relocation tasks. paper outline approach solving special type navigation tasks robotic systems, coalition robots (agents) acts 2d environment, modified actions, share goal location. latter originally unreachable members coalition, common task still accomplished agents assist (e.g. modifying environment). call tasks smart relocation tasks (as solved pure path planning methods) study spatial behavior interaction robots solving them. use cognitive approach introduce semiotic knowledge representation - sign world model underlines behavioral planning methodology. planning viewed recursive search process hierarchical state-space induced signs path planning signs reside lowest level. reaching level triggers path planning accomplished state art grid-based planners focused producing smooth paths (e.g. lian) thus indirectly guaranteeing feasibility paths agent's dynamic constraints.",4 "bayesian filtering odes bounded derivatives. recently increasing interest probabilistic solvers ordinary differential equations (odes) return full probability measures, instead point estimates, solution incorporate uncertainty ode hand, e.g. vector field initial value approximately known evaluable. ode filter proposed recent work models solution ode gauss-markov process serves prior sense bayesian statistics. previous work employed wiener process prior (possibly multiple times) differentiated solution ode established equivalence corresponding solver classical numerical methods, paper raises question whether priors also yield practically useful solvers. 
end, discuss range possible priors enable fast filtering propose new prior--the integrated ornstein uhlenbeck process (ioup)--that complements existing integrated wiener process (iwp) filter encoding property derivative time solution bounded sense tends drift back zero. provide experiments comparing iwp ioup filters support belief iwp approximates better divergent ode's solutions whereas ioup better prior trajectories bounded derivatives.",4 "safedrive: robust lane tracking system autonomous assisted driving limited visibility. present approach towards robust lane tracking assisted autonomous driving, particularly poor visibility. autonomous detection lane markers improves road safety, purely visual tracking desirable widespread vehicle compatibility reducing sensor intrusion, cost, energy consumption. however, visual approaches often ineffective number factors, including limited occlusion, poor weather conditions, paint wear-off. method, named safedrive, attempts improve visual lane detection approaches drastically degraded visual conditions without relying additional active sensors. scenarios visual lane detection algorithms unable detect lane markers, proposed approach uses location information vehicle locate access alternate imagery road attempts detection secondary image. subsequently, using combination feature-based pixel-based alignment, estimated location lane marker found current scene. demonstrate effectiveness system actual driving data locations united states google street view source alternate imagery.",4 "stance classification rumours sequential task exploiting tree structure social media conversations. rumour stance classification, task determines tweet collection discussing rumour supporting, denying, questioning simply commenting rumour, attracting substantial interest. introduce novel approach makes use sequence transitions observed tree-structured conversation threads twitter. 
conversation threads formed harvesting users' replies one another, results nested tree-like structure. previous work addressing stance classification task treated tweet separate unit. analyse tweets virtue position sequence test two sequential classifiers, linear-chain crf tree crf, makes different assumptions conversational structure. experiment eight twitter datasets, collected breaking news, show exploiting sequential structure twitter conversations achieves significant improvements non-sequential methods. work first model twitter conversations tree structure manner, introducing novel way tackling nlp tasks twitter conversations.",4 "learning conditional independence structure high-dimensional uncorrelated vector processes. formulate analyze graphical model selection method inferring conditional independence graph high-dimensional nonstationary gaussian random process (time series) finite-length observation. observed process samples assumed uncorrelated time time-varying marginal distribution. selection method based testing conditional variances obtained small subsets process components. allows cope high-dimensional regime, sample size (drastically) smaller process dimension. characterize required sample size proposed selection method successful high probability.",19 "optimal sparse linear auto-encoders sparse pca. principal components analysis (pca) optimal linear auto-encoder data, often used construct features. enforcing sparsity principal components promote better generalization, improving interpretability features. study problem constructing optimal sparse linear auto-encoders. two natural questions setting are: i) given level sparsity, best approximation pca achieved? ii) low-order polynomial-time algorithms asymptotically achieve optimal tradeoff sparsity approximation quality? 
work, answer questions giving efficient low-order polynomial-time algorithms constructing asymptotically \emph{optimal} linear auto-encoders (in particular, sparse features near-pca reconstruction error) demonstrate performance algorithms real data.",4 "transfer learning, soft distance-based bias, hierarchical boa. automated technique recently proposed transfer learning hierarchical bayesian optimization algorithm (hboa) based distance-based statistics. technique enables practitioners improve hboa efficiency collecting statistics probabilistic models obtained previous hboa runs using obtained statistics bias future hboa runs similar problems. purpose paper threefold: (1) test technique several classes np-complete problems, including maxsat, spin glasses minimum vertex cover; (2) demonstrate technique effective even previous runs done problems different size; (3) provide empirical evidence combining transfer learning efficiency enhancement techniques often yield nearly multiplicative speedups.",4 "weight initialization deep neural networks (dnns) using data statistics. deep neural networks (dnns) form backbone almost every state-of-the-art technique fields computer vision, speech processing, text analysis. recent advances computational technology made use dnns practical. despite overwhelming performances dnn advances computational technology, seen researchers try train models scratch. training dnns still remains difficult tedious job. main challenges researchers face training dnns vanishing/exploding gradient problem highly non-convex nature objective function million variables. approaches suggested xavier solve vanishing gradient problem providing sophisticated initialization technique. approaches quite effective achieved good results standard datasets, approaches work well practical datasets. think reason making use data statistics initializing network weights. optimizing high dimensional loss function requires careful initialization network weights. 
work, propose data dependent initialization analyze performance standard initialization techniques xavier. performed experiments practical datasets results show algorithm's superior classification accuracy.",4 "logical stochastic optimization. present logical framework represent reason stochastic optimization problems based probability answer set programming. established allowing probability optimization aggregates, e.g., minimum maximum language probability answer set programming allow minimization maximization desired criteria probabilistic environments. show application proposed logical stochastic optimization framework probability answer set programming two stages stochastic optimization problems recourse.",4 "inferring disease gene set associations rank coherence networks. computational challenge validate candidate disease genes identified high-throughput genomic study elucidate associations set candidate genes disease phenotypes. conventional gene set enrichment analysis often fails reveal associations disease phenotypes gene sets short list poorly annotated genes, existing annotations disease causative genes incomplete. propose network-based computational approach called rcnet discover associations gene sets disease phenotypes. assuming coherent associations genes ranked relevance query gene set, disease phenotypes ranked relevance hidden target disease phenotypes query gene set, formulate learning framework maximizing rank coherence respect known disease phenotype-gene associations. efficient algorithm coupling ridge regression label propagation, two variants introduced find optimal solution framework. evaluated rcnet algorithms existing baseline methods leave-one-out cross-validation task predicting recently discovered disease-gene associations omim. experiments demonstrated rcnet algorithms achieved best overall rankings compared baselines. 
validate reproducibility performance, applied algorithms identify target diseases novel candidate disease genes obtained recent studies gwas, dna copy number variation analysis, gene expression profiling. algorithms ranked target disease candidate genes top rank list many cases across three case studies. rcnet algorithms available webtool disease gene set association analysis http://compbio.cs.umn.edu/dgsa_rcnet.",16 "statistical analysis loopy belief propagation random fields. loopy belief propagation (lbp), equivalent bethe approximation statistical mechanics, message-passing-type inference method widely used analyze systems based markov random fields (mrfs). paper, propose message-passing-type method analytically evaluate quenched average lbp random fields using replica cluster variation method. proposed analytical method applicable general pair-wise mrfs random fields whose distributions differ give quenched averages bethe free energies random fields, consistent numerical results. order computational cost equivalent standard lbp. latter part paper, describe application proposed method bayesian image restoration, observed theoretical results good agreement numerical results natural images.",19 "second croatian computer vision workshop (ccvw 2013). proceedings second croatian computer vision workshop (ccvw 2013, http://www.fer.unizg.hr/crv/ccvw2013) held september 19, 2013, zagreb, croatia. workshop organized center excellence computer vision university zagreb.",4 "morphologic knowledge dynamics: revision, fusion, abduction. several tasks artificial intelligence require able find models knowledge dynamics. include belief revision, fusion belief merging, abduction. paper exploit algebraic framework mathematical morphology context propositional logic, define operations dilation erosion set formulas. derive concrete operators, based semantic approach, intuitive interpretation formally well behaved, perform revision, fusion abduction. 
computation tractability addressed, simple examples illustrate typical results obtained.",4 "prediction-adaptation-correction recurrent neural networks low-resource language speech recognition. paper, investigate use prediction-adaptation-correction recurrent neural networks (pac-rnns) low-resource speech recognition. pac-rnn comprised pair neural networks {\it correction} network uses auxiliary information given {\it prediction} network help estimate state probability. information correction network also used prediction network recurrent loop. model outperforms state-of-the-art neural networks (dnns, lstms) iarpa-babel tasks. moreover, transfer learning language similar target language help improve performance further.",4 "understanding convolutional networks apple : automatic patch pattern labeling explanation. success deep learning, recent efforts focused analyzing learned networks make classifications. interested analyzing network output based network structure information flow network layers. contribute algorithm 1) analyzing deep network find neurons 'important' terms network classification outcome, 2)automatically labeling patches input image activate important neurons. propose several measures importance neurons demonstrate technique used gain insight into, explain network decomposes image make final classification.",4 "representing reasoning probabilistic knowledge: bayesian approach. pagoda (probabilistic autonomous goal-directed agent) model autonomous learning probabilistic domains [desjardins, 1992] incorporates innovative techniques using agent's existing knowledge guide constrain learning process representing, reasoning with, learning probabilistic knowledge. paper describes probabilistic representation inference mechanism used pagoda. pagoda forms theories effects actions world state environment time. theories represented conditional probability distributions. 
restriction imposed structure theories allows inference mechanism find unique predicted distribution action world state description. restricted theories called uniquely predictive theories. inference mechanism, probability combination using independence (pci), uses minimal independence assumptions combine probabilities theory make probabilistic predictions.",4 "alternative gospel structure: order, composition, processes. survey basic mathematical structures, arguably primitive structures taught school. structures orders, without composition, (symmetric) monoidal categories. list several `real life' incarnations these. paper also serves introduction structures current potentially future uses linguistics, physics knowledge representation.",12 "model-free episodic control. state art deep reinforcement learning algorithms take many millions interactions attain human-level performance. humans, hand, quickly exploit highly rewarding nuances environment upon first discovery. brain, rapid learning thought depend hippocampus capacity episodic memory. investigate whether simple model hippocampal episodic control learn solve difficult sequential decision-making tasks. demonstrate attains highly rewarding strategy significantly faster state-of-the-art deep reinforcement learning algorithms, also achieves higher overall reward challenging domains.",19 "learning classify possible sensor failures. paper, propose general framework learn robust large-margin binary classifier corrupt measurements, called anomalies, caused sensor failure might present training set. goal minimize generalization error classifier non-corrupted measurements controlling false alarm rate associated anomalous samples. incorporating non-parametric regularizer based empirical entropy estimator, propose geometric-entropy-minimization regularized maximum entropy discrimination (gem-med) method learn classify detect anomalies joint manner. demonstrate using simulated data real multimodal data set. 
gem-med method yield improved performance previous robust classification methods terms classification accuracy anomaly detection rate.",4 "large margin image set representation classification. paper, propose novel image set representation classification method maximizing margin image sets. margin image set defined difference distance nearest image set different classes distance nearest image set class. modeling image sets using image samples affine hull models, maximizing margins images sets, image set representation parameter learning problem formulated minimization problem, optimized expectation-maximization (em) strategy accelerated proximal gradient (apg) optimization iterative algorithm. classify given test image set, assign class could provide largest margin. experiments two applications video-sequence-based face recognition demonstrate proposed method significantly outperforms state-of-the-art image set classification methods terms effectiveness efficiency.",4 "belief merging source reliability assessment. merging beliefs requires plausibility sources information merged. typically assumed equally reliable lack hints indicating otherwise; yet, recent line research spun idea deriving information revision process itself. particular, history previous revisions previous merging examples provide information performing subsequent mergings. yet, examples previous revisions may available. spite apparent lack information, something still inferred try-and-check approach: relative reliability ordering assumed, merging process performed based it, result compared original information. outcome check may incoherent initial assumption, like completely reliable source rejected information provided. cases, reliability ordering assumed first place excluded consideration. first theorem article proves scenario indeed possible. results obtained various definitions reliability merging.",4 "detailed, accurate, human shape estimation clothed 3d scan sequences. 
address problem estimating human pose body shape 3d scans time. reliable estimation 3d body shape necessary many applications including virtual try-on, health monitoring, avatar creation virtual reality. scanning bodies minimal clothing, however, presents practical barrier applications. address problem estimating body shape clothing sequence 3d scans. previous methods exploited body models produce smooth shapes lacking personalized details. contribute new approach recover personalized shape person. estimated shape deviates parametric model fit 3d scans. demonstrate method using high quality 4d data well sequences visual hulls extracted multi-view images. also make available buff, new 4d dataset enables quantitative evaluation (http://buff.is.tue.mpg.de). method outperforms state art pose estimation shape estimation, qualitatively quantitatively.",4 "efficient methods unsupervised learning probabilistic models. thesis develop variety techniques train, evaluate, sample intractable high dimensional probabilistic models. abstract exceeds arxiv space limitations -- see pdf.",4 "dictionary based approach edge detection. edge detection essential part image processing, quality accuracy detection determines success processing. developed new self learning technique edge detection using dictionary comprised eigenfilters constructed using features input image. dictionary based method eliminates need pre post processing image accounts noise, blurriness, class image variation illumination detection process itself. since, method depends characteristics image, new technique detect edges accurately capture greater detail existing algorithms sobel, prewitt laplacian gaussian, canny method etc use generic filters operators. demonstrated application various classes images text, face, barcodes, traffic cell images. application technique cell counting microscopic image also presented.",4 "hilbert space embeddings pomdps. nonparametric approach policy learning pomdps proposed. 
approach represents distributions states, observations, actions embeddings feature spaces, reproducing kernel hilbert spaces. distributions states given observations obtained applying kernel bayes' rule distribution embeddings. policies value functions defined feature space states, leads feature space expression bellman equation. value iteration may used estimate optimal value function associated policy. experimental results confirm correct policy learned using feature space representation.",4 "theoretical framework robustness (deep) classifiers adversarial examples. machine learning classifiers, including deep neural networks, vulnerable adversarial examples. inputs typically generated adding small purposeful modifications lead incorrect outputs imperceptible human eyes. goal paper introduce single method, make theoretical steps towards fully understanding adversarial examples. using concepts topology, theoretical analysis brings forth key reasons adversarial example fool classifier ($f_1$) adds oracle ($f_2$, like human eyes) analysis. investigating topological relationship two (pseudo)metric spaces corresponding predictor $f_1$ oracle $f_2$, develop necessary sufficient conditions determine $f_1$ always robust (strong-robust) adversarial examples according $f_2$. interestingly theorems indicate one unnecessary feature make $f_1$ strong-robust, right feature representation learning key getting classifier accurate strong-robust.",4 "deep learning physical processes: incorporating prior scientific knowledge. consider use deep learning methods modeling complex phenomena like occurring natural physical processes. large amount data gathered phenomena data intensive paradigm could begin challenge traditional approaches elaborated years fields like maths physics. however, despite considerable successes variety application domains, machine learning field yet ready handle level complexity required problems. 
using example application, namely sea surface temperature prediction, show general background knowledge gained physics could used guideline designing efficient deep learning models. order motivate approach assess generality demonstrate formal link solution class differential equations underlying large family physical phenomena proposed model. experiments comparison series baselines including state art numerical approach provided.",4 "production system rules protein complexes genetic regulatory networks. short paper introduces new way design production system rules. indirect encoding scheme presented views rules protein complexes produced temporal behaviour artificial genetic regulatory network. initial study begins using simple boolean regulatory network produce traditional ternary-encoded rules moving fuzzy variant produce real-valued rules. competitive performance shown related genetic regulatory networks rule-based systems benchmark problems.",4 "unsupervised iterative deep learning speech features acoustic tokens applications spoken term detection. paper aim automatically discover high quality frame-level speech features acoustic tokens directly unlabeled speech data. multi-granular acoustic tokenizer (mat) proposed automatic discovery multiple sets acoustic tokens given corpus. acoustic token set specified set hyperparameters describing model configuration. different sets acoustic tokens carry different characteristics given corpus language behind, thus mutually reinforced. multiple sets token labels used targets multi-target deep neural network (mdnn) trained frame-level acoustic features. bottleneck features extracted mdnn used feedback input mat mdnn next iteration. multi-granular acoustic token sets frame-level speech features iteratively optimized iterative deep learning framework. call framework multi-granular acoustic tokenizing deep neural network (matdnn). 
the results were evaluated using the metrics and corpora defined in the zero resource speech challenge organized at interspeech 2015, and improved performance was obtained in a set of experiments of query-by-example spoken term detection on the same corpora. visualization for the discovered tokens against english phonemes is also shown.",4 "underwater multi-robot convoying using visual tracking by detection. we present a robust multi-robot convoying approach that relies on visual detection of the leading agent, thus enabling target following in unstructured 3-d environments. our method is based on the idea of tracking-by-detection, which interleaves efficient model-based object detection with temporal filtering of image-based bounding box estimation. this approach has the important advantage of mitigating tracking drift (i.e. drifting away from the target object), which is a common symptom of model-free trackers and is detrimental to sustained convoying in practice. to illustrate our solution, we collected extensive footage of an underwater robot in ocean settings, and hand-annotated its location in each frame. based on this dataset, we present an empirical comparison of multiple tracker variants, including the use of several convolutional neural networks, both with and without recurrent connections, as well as frequency-based model-free trackers. we also demonstrate the practicality of this tracking-by-detection strategy in real-world scenarios by successfully controlling a legged underwater robot in five degrees of freedom to follow another robot's independent motion.",4 "a dual approach to scalable verification of deep networks. this paper addresses the problem of formally verifying desirable properties of neural networks, i.e., obtaining provable guarantees that the outputs of the neural network will always behave in a certain way for a given class of inputs. most previous work on this topic was limited in its applicability by the size of the network, its architecture and the complexity of the properties to be verified. in contrast, our framework applies to a much more general class of activation functions and specifications on neural network inputs and outputs. we formulate verification as an optimization problem and solve a lagrangian relaxation of the optimization problem to obtain an upper bound on the verification objective. our approach is anytime, i.e. it can be stopped at any time and a valid bound on the objective can be obtained. 
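the bound-propagation idea behind the verification abstract above can be illustrated with a much simpler relative of its lagrangian relaxation: sound interval arithmetic pushed through a small relu network. this is a sketch only; the weights, function names, and the use of plain interval bounds (rather than the paper's dual formulation) are our own illustrative choices.

```python
import numpy as np

def affine_bounds(W, b, l, u):
    # propagate an axis-aligned box [l, u] through x -> W x + b
    c, r = (l + u) / 2, (u - l) / 2
    center = W @ c + b
    radius = np.abs(W) @ r          # worst-case deviation per output
    return center - radius, center + radius

def relu_bounds(l, u):
    # relu is monotone, so bounds pass through elementwise
    return np.maximum(l, 0), np.maximum(u, 0)

def network_output_bounds(layers, l, u):
    # layers: list of (W, b) pairs; relu between layers, linear last layer
    for i, (W, b) in enumerate(layers):
        l, u = affine_bounds(W, b, l, u)
        if i < len(layers) - 1:
            l, u = relu_bounds(l, u)
    return l, u
```

a property such as "the output never exceeds a threshold t on this input box" is certified whenever the returned upper bound is below t; the bounds are always valid but can be loose, which is exactly why tighter relaxations like the lagrangian dual matter.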
we further develop specialized verification algorithms with provable tightness guarantees under special assumptions and demonstrate the practical significance of our general verification approach on a variety of verification tasks.",4 "modeep: a deep learning framework using motion features for human pose estimation. in this work, we propose a novel and efficient method for articulated human pose estimation in videos using a convolutional network architecture which incorporates both color and motion features. we propose a new human body pose dataset, flic-motion, which extends the flic dataset with additional motion features. we apply our architecture to this dataset and report significantly better performance than current state-of-the-art pose detection systems.",4 "detecting multiword phrases in mathematical text corpora. we present an approach for detecting multiword phrases in mathematical text corpora. the method used is based on characteristic features of mathematical terminology. it makes use of a software tool named lingo which allows to identify words by means of previously defined dictionaries for specific word classes such as adjectives, personal names or nouns. the detection of multiword groups is done algorithmically. possible advantages of the method for indexing and information retrieval as well as conclusions for applying dictionary-based methods for automatic indexing instead of stemming procedures are discussed.",4 "uncovering latent style factors for expressive speech synthesis. prosodic modeling is a core problem in speech synthesis. the key challenge is producing desirable prosody from textual input containing only phonetic information. in this preliminary study, we introduce the concept of ""style tokens"" in tacotron, a recently proposed end-to-end neural speech synthesis model. using style tokens, we aim to extract independent prosodic styles from training data. we show that without annotation data or an explicit supervision signal, our approach can automatically learn a variety of prosodic variations in a purely data-driven way. importantly, each style token corresponds to a fixed style factor regardless of the given text sequence. as a result, we can control the prosodic style of synthetic speech in a somewhat predictable and globally consistent way.",4 "multi-document summarization via discriminative summary reranking. 
existing multi-document summarization systems usually rely on a specific summarization model (i.e., a summarization method with a specific parameter setting) to extract summaries for different document sets with different topics. however, according to our quantitative analysis, none of the existing summarization models can always produce high-quality summaries for different document sets, and even a summarization model with good overall performance may produce low-quality summaries for some document sets. on the contrary, a baseline summarization model may produce high-quality summaries for some document sets. based on these observations, we treat the summaries produced by different summarization models as candidate summaries, and then explore discriminative reranking techniques to identify the high-quality summaries from the candidates for different document sets. we propose to extract a set of candidate summaries for each document set based on an ilp framework, and then leverage ranking svm for summary reranking. various useful features have been developed for the reranking process, including word-level features, sentence-level features and summary-level features. evaluation results on the benchmark duc datasets validate the efficacy and robustness of our proposed approach.",4 "understanding physics of interconnected data. metal melting after release of an explosion is a physical system far from equilibrium. a complete physical model of this system does not exist, because many interrelated effects have to be considered. a general methodology needs to be developed to describe and understand the physical phenomena involved. high noise in the data, moving blur in the images, a high degree of uncertainty due to different types of sensors, and information entangled and hidden inside noisy images make reasoning about the physical processes difficult. the major problems include proper information extraction, the problem of reconstruction, as well as prediction of missing data. in this paper, several techniques addressing the first problem are given, building a basis for tackling the second problem.",4 "robust face recognition via block sparse bayesian learning. face recognition (fr) is an important task in pattern recognition and computer vision. sparse representation (sr) has been demonstrated to be a powerful framework for fr. 
in general, an sr algorithm treats each face in a training dataset as a basis function, and tries to find a sparse representation of a test face under these basis functions. the sparse representation coefficients then provide a recognition hint. early sr algorithms were based on a basic sparse model. recently, it was found that algorithms based on a block sparse model can achieve better recognition rates. based on this model, we study the use of block sparse bayesian learning (bsbl) to find a sparse representation of a test face for recognition. bsbl is a recently proposed framework which has many advantages over existing block-sparse-model based algorithms. experimental results on the extended yale b, the ar and the cmu pie face databases show that using bsbl can achieve better recognition rates and higher robustness than state-of-the-art algorithms in most cases.",4 "helping domain experts build speech translation systems. we present a new platform, ""regulus lite"", which supports rapid development and web deployment of several types of phrasal speech translation systems using a minimal formalism. a distinguishing feature is that the development work can be performed directly by domain experts. we motivate the need for platforms of this type and discuss three specific cases: medical speech translation, speech-to-sign-language translation and voice questionnaires. we briefly describe initial experiences in developing practical systems.",4 "attention-based guided structured sparsity of deep neural networks. network pruning is aimed at imposing sparsity in a neural network architecture by increasing the portion of zero-valued weights, reducing its size with regard to energy-efficiency considerations and increasing evaluation speed. in most of the conducted research efforts, the sparsity is enforced for network pruning without any attention to internal network characteristics such as unbalanced outputs of the neurons, or more specifically the distribution of the weights and outputs of the neurons. this may cause a severe accuracy drop due to uncontrolled sparsity. in this work, we propose an attention mechanism that simultaneously controls the sparsity intensity and supervised network pruning by keeping the important information bottlenecks of the network active. 
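the pruning abstract above contrasts its attention-guided scheme with sparsity enforced without regard to weight distribution; the plain magnitude-pruning baseline it improves on can be sketched in a few lines (the function name and threshold rule are ours, not the paper's):

```python
import numpy as np

def magnitude_prune(w, sparsity):
    """zero out the fraction `sparsity` of smallest-magnitude weights."""
    k = int(sparsity * w.size)
    if k == 0:
        return w.copy()
    mask = np.ones(w.size, dtype=bool)
    mask[np.argsort(np.abs(w).ravel())[:k]] = False   # drop the k smallest entries
    return (w.ravel() * mask).reshape(w.shape)
```

this is exactly the uncontrolled kind of sparsity the abstract warns about: every layer is pruned by magnitude alone, with no attention to which neurons carry important information.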
on cifar-10, the proposed method outperforms the best baseline method by 6%, with a reduced accuracy drop at a 2.6x level of sparsity.",4 "a generalized linear mixing model accounting for endmember variability. endmember variability is an important factor for accurately unveiling vital information relating the pure materials and their distribution in hyperspectral images. recently, the extended linear mixing model (elmm) has been proposed as a modification of the linear mixing model (lmm) to consider endmember variability effects resulting mainly from illumination changes. in this paper, we generalize the elmm leading to a new model (glmm) to account for more complex spectral distortions where different wavelength intervals can be affected unevenly. we also extend the existing methodology to jointly estimate the variability and the abundances for the glmm. simulations with real and synthetic data show that the unmixing process can benefit from the extra flexibility introduced by the glmm.",4 "categorization axioms for clustering results. cluster analysis has attracted more and more attention in the field of machine learning and data mining. numerous clustering algorithms have been proposed and developed due to diverse theories and various requirements of emerging applications. therefore, it is worth establishing a unified axiomatic framework for data clustering. in the literature, this is an open problem which has proved challenging. in this paper, clustering results are axiomatized by assuming that a proper clustering result should satisfy categorization axioms. the proposed axioms not only introduce a classification of clustering results and inequalities of clustering results, but are also consistent with the prototype theory and exemplar theory of categorization models in cognitive science. moreover, the proposed axioms lead to three principles for designing a clustering algorithm and a cluster validity index, which are followed by many popular clustering algorithms and cluster validity indices.",4 "multi-channel weighted nuclear norm minimization for real color image denoising. most of the existing denoising algorithms are developed for grayscale images, and it is not a trivial work to extend them for color image denoising because the noise statistics in the r, g, b channels can be very different for real noisy images. in this paper, we propose a multi-channel (mc) optimization model for real color image denoising under the weighted nuclear norm minimization (wnnm) framework. 
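the core computational step in wnnm-type models is weighted singular value thresholding; a minimal sketch (assuming non-descending weights, for which a closed-form proximal solution is known in the wnnm literature; the function name is ours):

```python
import numpy as np

def weighted_svt(Y, w):
    """proximal step of the weighted nuclear norm: soft-threshold
    each singular value of Y by its corresponding weight w[i]."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    s_shrunk = np.maximum(s - w, 0.0)       # shrink, never below zero
    return U @ (s_shrunk[:, None] * Vt)     # == U @ diag(s_shrunk) @ Vt
```

with all weights zero the input is returned unchanged; large weights drive the output toward a low-rank (eventually zero) matrix, which is the denoising mechanism exploited on stacks of similar patches.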
we concatenate the rgb patches to make use of the channel redundancy, and introduce a weight matrix to balance the data fidelity of the three channels in consideration of their different noise statistics. the proposed mc-wnnm model does not have an analytical solution. we reformulate it into a linear equality-constrained problem and solve it with the alternating direction method of multipliers. each alternative updating step has a closed-form solution and the convergence can be guaranteed. extensive experiments on both synthetic and real noisy image datasets demonstrate the superiority of the proposed mc-wnnm over state-of-the-art denoising methods.",4 "acquisition of visual features through probabilistic spike-timing-dependent plasticity. the final version of this paper has been published in ieeexplore and is available at http://ieeexplore.ieee.org/document/7727213. please cite this paper as: amirhossein tavanaei, timothee masquelier, and anthony maida, acquisition of visual features through probabilistic spike-timing-dependent plasticity. ieee international joint conference on neural networks. pp. 307-314, ijcnn 2016. this paper explores modifications to a feedforward five-layer spiking convolutional network (scn) of the ventral visual stream [masquelier, t., thorpe, s., unsupervised learning of visual features through spike timing dependent plasticity. plos computational biology, 3(2), 247-257]. the original model showed that a spike-timing-dependent plasticity (stdp) learning algorithm embedded in an appropriately selected scn could perform unsupervised feature discovery. the discovered features were interpretable and could effectively be used to perform rapid binary decisions in a classifier. in order to study the robustness of the previous results, the present research examines the effects of modifying components of the original model. for improved biological realism, we replace the original non-leaky integrate-and-fire neurons with izhikevich-like neurons. we also replace the original stdp rule with a novel rule that has a probabilistic interpretation. the probabilistic stdp slightly but significantly improves the performance for both types of model neurons. the use of the izhikevich-like neuron was not found to improve performance, although the performance remained comparable to the original neuron. this shows that the model is robust enough to handle more biologically realistic neurons. 
we also conclude that the underlying reasons for the stable performance of the model are preserved despite the overt changes to its explicit components.",4 "equations of states in singular statistical estimation. learning machines that have hierarchical structures or hidden variables are singular statistical models because they are nonidentifiable and their fisher information matrices are singular. in singular statistical models, neither does the bayes a posteriori distribution converge to the normal distribution nor does the maximum likelihood estimator satisfy asymptotic normality. this is the main reason why it has been difficult to predict their generalization performances from trained states. in this paper, we study four errors, (1) the bayes generalization error, (2) the bayes training error, (3) the gibbs generalization error, and (4) the gibbs training error, and prove the mathematical relations among these errors. the formulas proved in this paper are equations of states in statistical estimation because they hold true for any true distribution, any parametric model, and any a priori distribution. we also show that the bayes and gibbs generalization errors can be estimated by the bayes and gibbs training errors, and we propose widely applicable information criteria that can be applied to both regular and singular statistical models.",4 "parallel multi channel convolution using general matrix multiplication. convolutional neural networks (cnns) have emerged as one of the most successful machine learning technologies for image and video processing. the most computationally intensive parts of cnns are the convolutional layers, which convolve multi-channel images with multiple kernels. a common approach to implementing convolutional layers is to expand the image into a column matrix (im2col) and perform multiple channel multiple kernel (mcmk) convolution using an existing parallel general matrix multiplication (gemm) library. this im2col conversion greatly increases the memory footprint of the input matrix and reduces data locality. in this paper we propose a new approach to mcmk convolution that is based on general matrix multiplication (gemm), but not on im2col. our algorithm eliminates the need for data replication on the input, thereby enabling us to apply the convolution kernels to the input images directly. we implemented several variants of our algorithm on a cpu processor and an embedded arm processor. 
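the im2col baseline that the convolution abstract above sets out to avoid can be sketched directly; this is the data-replicating approach whose memory footprint the paper criticizes (a single-channel, 'valid'-mode sketch with our own function names, not the paper's im2col-free kernels):

```python
import numpy as np

def im2col(x, kh, kw):
    """unroll every kh-by-kw patch of a 2-d image into a row (valid positions).
    note the data replication: each pixel appears in up to kh*kw rows."""
    H, W = x.shape
    rows = []
    for i in range(H - kh + 1):
        for j in range(W - kw + 1):
            rows.append(x[i:i + kh, j:j + kw].ravel())
    return np.array(rows)

def conv2d_gemm(x, k):
    """'valid' cross-correlation expressed as a single matrix product."""
    kh, kw = k.shape
    out = im2col(x, kh, kw) @ k.ravel()
    return out.reshape(x.shape[0] - kh + 1, x.shape[1] - kw + 1)

def conv2d_direct(x, k):
    """reference implementation: explicit sliding-window loops."""
    kh, kw = k.shape
    out = np.zeros((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out
```

the gemm call is what makes this route fast on library blas, at the price of a column matrix roughly kh*kw times larger than the input, which is the locality cost the paper's im2col-free method removes.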
on the cpu, our algorithm is faster than im2col in most cases.",4 "machine comprehension based on learning to rank. machine comprehension plays an essential role in nlp and has been widely explored with datasets like mctest. however, this dataset is too simple and too small for learning true reasoning abilities. \cite{hermann2015teaching} therefore release a large scale news article dataset and propose a deep lstm reader system for machine comprehension. however, the training process is expensive. we therefore try a feature-engineered approach with semantics on the new dataset to see how traditional machine learning techniques with semantics can help machine comprehension. meanwhile, our proposed l2r reader system achieves good performance with efficiency and less training data.",4 effective sparse representation of x-ray medical images. effective sparse representation of x-ray medical images within the context of data reduction is considered. the proposed framework is shown to render an enormous reduction in the cardinality of the data set required to represent this class of images at good quality. the particularity of the approach is that it can be implemented at competitive processing time and low memory requirements,4 "deepvisage: making face recognition simple yet with powerful generalization skills. face recognition (fr) methods report significant performance by adopting convolutional neural network (cnn) based learning methods. although cnns are mostly trained by optimizing the softmax loss, the recent trend shows an improvement of accuracy with different strategies, such as task-specific cnn learning with different loss functions, fine-tuning on the target dataset, metric learning and concatenating features from multiple cnns. incorporating these tasks obviously requires additional efforts. moreover, it demotivates the discovery of efficient cnn models for fr which are trained only with identity labels. we focus on this fact and propose an easily trainable and single cnn based fr method. our cnn model exploits the residual learning framework. additionally, it uses normalized features to compute the loss. our extensive experiments show excellent generalization on different datasets. we obtain competitive and state-of-the-art results on the lfw, ijb-a, youtube faces and cacd datasets.",4 "tomographic reconstruction using a global statistical prior. 
recent research in tomographic reconstruction is motivated by the need to efficiently recover detailed anatomy from limited measurements. one of the ways to compensate for the increasingly sparse sets of measurements is to exploit the information from templates, i.e., prior data available in the form of already reconstructed, structurally similar images. towards this, previous work has exploited the use of a set of global and patch based dictionary priors. in this paper, we propose a global prior to improve both the speed and quality of tomographic reconstruction within a compressive sensing framework. we choose a set of potential representative 2d images, referred to as templates, to build an eigenspace; this is subsequently used to guide the iterative reconstruction of a similar slice from sparse acquisition data. experiments across a diverse range of datasets show that reconstruction using an appropriate global prior, apart from being faster, gives a much lower reconstruction error when compared to the state of the art.",4 "happy travelers take big pictures: a psychological study with machine learning and big data. in psychology, theory-driven researches are usually conducted with extensive laboratory experiments, yet rarely tested or disproved with big data. in this paper, we make use of 418k travel photos with traveler ratings to test the influential ""broaden-and-build"" theory, which suggests that positive emotions broaden one's visual attention. the core hypothesis examined in this study is that positive emotion is associated with a wider attention, hence highly-rated sites would trigger wide-angle photographs. by analyzing travel photos, we find a strong correlation between the preference for wide-angle photos and the high rating of tourist sites on tripadvisor. we are able to carry out this analysis through the use of deep learning algorithms to classify the photos into wide and narrow angles, and present this study as an exemplar of how big data and deep learning can be used to test laboratory findings in the wild.",4 "speeding up a sat solver by exploring cnf symmetries : revisited. boolean satisfiability solvers have gone through dramatic improvements in performance and scalability over the last few years by considering symmetries. it has been shown that by using graph symmetries and generating symmetry breaking predicates (sbps) it is possible to break symmetries in a conjunctive normal form (cnf). 
the sbps cut the search space down to the nonsymmetric regions without affecting the satisfiability of the cnf formula. the symmetry breaking predicates are created by representing the formula as a graph and finding the graph symmetries using a symmetry extraction mechanism (crawford et al.). in this paper we take one non-trivial cnf and explore its symmetries. we then generate the sbps and, by adding them to the cnf, show how they help prune the search tree so that a sat solver takes only a short time. we present the pruning procedure of the search tree from scratch, starting from the cnf and its graph representation. we explore the whole mechanism on a non-trivial example so that it is easily comprehensible. we also give a new idea for generating symmetry breaking predicates to break the symmetry of a cnf, derived from crawford's conditions. at last we propose a backtrack sat solver with an inbuilt sbp generator.",12 "deep generative filter for motion deblurring. removing the blur caused by camera shake in images has always been a challenging problem in the computer vision literature due to its ill-posed nature. motion blur caused by the relative motion between the camera and the object in 3d space induces a spatially varying blurring effect over the entire image. in this paper, we propose a novel deep filter based on a generative adversarial network (gan) architecture integrated with a global skip connection and a dense architecture in order to tackle this problem. our model, by bypassing the process of blur kernel estimation, significantly reduces the test time which is necessary for practical applications. experiments on benchmark datasets prove the effectiveness of the proposed method, which outperforms state-of-the-art blind deblurring algorithms both quantitatively and qualitatively.",4 "cost-based feature transfer for vehicle occupant classification. knowledge of human presence and interaction in a vehicle is of growing interest to vehicle manufacturers for design and safety purposes. we present a framework to perform the tasks of occupant detection and occupant classification for automatic child locks and airbag suppression. it operates for all passenger seats, using a single overhead camera. a transfer learning technique is introduced to make full use of training data from all seats whilst still maintaining some control over the bias, necessary for a system designed to penalize certain misclassifications more than others. 
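a standard way to penalize certain misclassifications more than others, as in the occupant-classification setting just described, is to decide by minimum expected cost rather than maximum posterior. the cost matrix below is purely illustrative, not taken from the paper:

```python
import numpy as np

def min_cost_decision(posteriors, cost):
    """posteriors: (n, k) class probabilities per sample.
    cost[i, j]: cost of predicting class i when the true class is j.
    returns, per sample, the class with minimum expected cost."""
    expected = posteriors @ cost.T      # (n, k) expected cost of each prediction
    return np.argmin(expected, axis=1)
```

for example, with classes (empty, occupied) and a cost matrix making a missed occupant ten times worse than a false alarm, a posterior of (0.7, 0.3) still yields "occupied" (expected costs 3.0 vs 0.7), whereas a plain argmax would say "empty".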
an evaluation is performed on a challenging dataset with both weighted and unweighted classifiers, demonstrating the effectiveness of the transfer process.",4 "near optimal behavior via approximate state abstraction. the combinatorial explosion that plagues planning and reinforcement learning (rl) algorithms can be moderated using state abstraction. prohibitively large task representations can be condensed such that essential information is preserved, and consequently, solutions are tractably computable. however, exact abstractions, which treat only fully-identical situations as equivalent, fail to present opportunities for abstraction in environments where no two situations are exactly alike. in this work, we investigate approximate state abstractions, which treat nearly-identical situations as equivalent. we present theoretical guarantees on the quality of behaviors derived from four types of approximate abstractions. additionally, we empirically demonstrate that approximate abstractions lead to a reduction in task complexity and a bounded loss of optimality of behavior in a variety of environments.",4 "mining generalized graph patterns based on user examples. there has been a lot of recent interest in mining patterns from graphs. often, the exact structure of the patterns of interest is not known. this happens, for example, when molecular structures are mined to discover fragments useful as features in a chemical compound classification task, or when web sites are mined to discover sets of web pages representing logical documents. such patterns are often generated from a few small subgraphs (cores), according to certain generalization rules (grs). we call such patterns ""generalized patterns""(gps). while being structurally different, gps often perform the same function in the network. previously proposed approaches to mining gps either assumed that the cores and the grs are given, or that all interesting gps are frequent. these are strong assumptions which often do not hold in practical applications. in this paper, we propose an approach to mining gps that is free of these assumptions. given a small number of gps selected by the user, our algorithm discovers all gps similar to the user examples. first, a machine learning-style approach is used to find the cores. second, generalizations of the cores in the graph are computed to identify gps. 
an evaluation on synthetic data, generated using real cores and grs from the biological and web domains, demonstrates the effectiveness of our approach.",4 "high dimensional semiparametric gaussian copula graphical models. in this paper, we propose a semiparametric approach, named nonparanormal skeptic, for efficiently and robustly estimating high dimensional undirected graphical models. to achieve modeling flexibility, we consider the gaussian copula graphical models (or the nonparanormal) as proposed by liu et al. (2009). to achieve estimation robustness, we exploit nonparametric rank-based correlation coefficient estimators, including spearman's rho and kendall's tau. in high dimensional settings, we prove that the nonparanormal skeptic achieves the optimal parametric rate of convergence in both graph and parameter estimation. this celebrating result suggests that the gaussian copula graphical models can be used as a safe replacement of the popular gaussian graphical models, even when the data are truly gaussian. besides the theoretical analysis, we also conduct thorough numerical simulations to compare different estimators for their graph recovery performance under both ideal and noisy settings. the proposed methods are then applied on a large-scale genomic dataset to illustrate their empirical usefulness. the r language software package huge implementing the proposed methods is available on the comprehensive r archive network: http://cran. r-project.org/.",19 "surrogate regret bounds for bipartite ranking via strongly proper losses. the problem of bipartite ranking, where instances are labeled positive or negative and the goal is to learn a scoring function that minimizes the probability of mis-ranking a pair of positive and negative instances (or equivalently, that maximizes the area under the roc curve), has been widely studied in recent years. a dominant theoretical and algorithmic framework for the problem has been to reduce bipartite ranking to pairwise classification; in particular, it is well known that the bipartite ranking regret can be formulated as a pairwise classification regret, which in turn can be upper bounded using usual regret bounds for classification problems. recently, kotlowski et al. (2011) showed regret bounds for bipartite ranking in terms of the regret associated with balanced versions of the standard (non-pairwise) logistic and exponential losses. 
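the quantity being bounded in the ranking abstract above, the probability of mis-ranking a positive-negative pair, is one minus the auc; the pairwise definition can be computed directly (a small sketch, with ties counted half, equivalent to the area under the roc curve):

```python
import numpy as np

def pairwise_auc(scores, labels):
    """fraction of positive-negative pairs ranked correctly (ties = 1/2)."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    correct = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (correct + 0.5 * ties) / (len(pos) * len(neg))
```

the surrogate-regret results discussed above bound the gap between 1 - auc and its minimum in terms of the excess risk of a proper (composite) loss minimized on single instances, avoiding explicit pairwise training.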
in this paper, we show that such (non-pairwise) surrogate regret bounds for bipartite ranking can be obtained in terms of a broad class of proper (composite) losses that we term strongly proper. our proof technique is much simpler than that of kotlowski et al. (2011), and relies on properties of proper (composite) losses as elucidated recently by reid and williamson (2010, 2011) and others. our result yields explicit surrogate bounds (with no hidden balancing terms) in terms of a variety of strongly proper losses, including for example the logistic, exponential, squared and squared hinge losses as special cases. we also obtain tighter surrogate bounds under certain low-noise conditions via a recent result of clemencon and robbiano (2011).",4 "discriminative optimization: theory and applications to computer vision problems. many computer vision problems are formulated as the optimization of a cost function. this approach faces two main challenges: (i) designing a cost function with a local optimum at an acceptable solution, and (ii) developing an efficient numerical method to search for one (or multiple) of these local optima. while designing such functions is feasible in the noiseless case, the stability and location of local optima are mostly unknown under noise, occlusion, or missing data. in practice, this can result in undesirable local optima, or in no local optimum in the expected place. on the other hand, numerical optimization algorithms in high-dimensional spaces are typically local and often rely on expensive first or second order information to guide the search. to overcome these limitations, this paper proposes discriminative optimization (do), a method that learns search directions from data without the need of a cost function. specifically, do explicitly learns a sequence of updates in the search space that leads to stationary points corresponding to the desired solutions. we provide a formal analysis of do and illustrate its benefits in the problems of 3d point cloud registration, camera pose estimation, and image denoising. we show that do performed comparably to or outperformed state-of-the-art algorithms in terms of accuracy, robustness to perturbations, and computational efficiency.",4 "statistical learning of arbitrary computable classifiers. statistical learning theory chiefly studies restricted hypothesis classes, particularly those with finite vapnik-chervonenkis (vc) dimension. 
the fundamental quantity of interest is the sample complexity: the number of samples required to learn to a specified level of accuracy. here we consider learning over the set of all computable labeling functions. since the vc-dimension is infinite and a priori (uniform) bounds on the number of samples are impossible, we let the learning algorithm decide when it has seen sufficient samples to have learned. we first show that learning in this setting is indeed possible, and develop a learning algorithm. we then show, however, that bounding sample complexity independently of the distribution is impossible. notably, this impossibility is entirely due to the requirement that the learning algorithm be computable, and not due to the statistical nature of the problem.",4 "discrete dynamical genetic programming in xcs. a number of representation schemes have been presented for use within learning classifier systems, ranging from binary encodings to neural networks. this paper presents results from an investigation into using a discrete dynamical system representation within the xcs learning classifier system. in particular, asynchronous random boolean networks are used to represent the traditional condition-action production system rules. it is shown to be possible to use self-adaptive, open-ended evolution to design an ensemble of such discrete dynamical systems within xcs to solve a number of well-known test problems.",4 "coarse to fine non-rigid registration: a chain of scale-specific neural networks for multimodal image alignment with application to remote sensing. we tackle the problem of multimodal image non-rigid registration, which is of prime importance in remote sensing and medical imaging. the difficulties encountered by classical registration approaches include feature design and slow optimization by gradient descent. by analyzing these methods, we note the significance of the notion of scale. we design easy-to-train, fully-convolutional neural networks able to learn scale-specific features. once chained appropriately, they perform global registration in linear time, getting rid of gradient descent schemes by predicting directly the deformation. we show their performance in terms of quality and speed through various tasks of remote sensing multimodal image alignment. 
in particular, we are able to register correctly cadastral maps of buildings as well as road polylines onto rgb images, and outperform current keypoint matching methods.",4 "multigrid with rough coefficients and multiresolution operator decomposition from hierarchical information games. we introduce a near-linear complexity (geometric and meshless/algebraic) multigrid/multiresolution method for pdes with rough ($l^\infty$) coefficients with rigorous a-priori accuracy and performance estimates. the method is discovered through a decision/game theory formulation of the problems of (1) identifying restriction and interpolation operators, (2) recovering a signal from incomplete measurements based on norm constraints on its image under a linear operator, and (3) gambling on the value of the solution of the pde based on a hierarchy of nested measurements of its solution or source term. the resulting elementary gambles form a hierarchy of (deterministic) basis functions of $h^1_0(\omega)$ (gamblets) that (1) are orthogonal across subscales/subbands with respect to the scalar product induced by the energy norm of the pde, (2) enable sparse compression of the solution space in $h^1_0(\omega)$, and (3) induce an orthogonal multiresolution operator decomposition. the operating diagram of the multigrid method is that of an inverted pyramid in which gamblets are computed locally (by virtue of their exponential decay), hierarchically (from fine to coarse scales), and the pde is decomposed into a hierarchy of independent linear systems with uniformly bounded condition numbers. the resulting algorithm is parallelizable both in space (via localization) and in bandwidth/subscale (subscales can be computed independently from each other). although the method is deterministic, it has a natural bayesian interpretation under the measure of probability emerging (as a mixed strategy) from the information game formulation, and the multiresolution approximations form a martingale with respect to the filtration induced by the hierarchy of nested measurements.",12 "inductive sparse subspace clustering. sparse subspace clustering (ssc) has achieved state-of-the-art clustering quality by performing spectral clustering over an $\ell^{1}$-norm based similarity graph. however, ssc is a transductive method which cannot handle data not used to construct the graph (out-of-sample data). 
for each new datum, ssc requires solving $n$ optimization problems in o(n) variables, i.e., performing the algorithm over the whole data set, where $n$ is the number of data points. therefore, it is inefficient to apply ssc in fast online clustering and scalable graphing. in this letter, we propose an inductive spectral clustering algorithm, called inductive sparse subspace clustering (issc), which makes ssc feasible for clustering out-of-sample data. issc adopts the assumption that high-dimensional data actually lie on a low-dimensional manifold, such that out-of-sample data could be grouped in the embedding space learned from the in-sample data. experimental results show that issc is promising in clustering out-of-sample data.",4 "multilevel context representation for improving object recognition. in this work, we propose the combined usage of low- and high-level blocks of convolutional neural networks (cnns) for improving object recognition. while recent research focused on either propagating the context from all layers, e.g. resnet (including the low-level layers), or having multiple loss layers (e.g. googlenet), the importance of the features close to the higher layers is ignored. this paper postulates that the use of context closer to the high-level layers provides scale and translation invariance and works better than using the top layer only. in particular, we extend alexnet and googlenet by additional connections in the top $n$ layers. in order to demonstrate the effectiveness of the proposed approach, it is evaluated on the standard imagenet task. the relative reduction of the classification error is around 1-2% without affecting the computational cost. furthermore, we show that this approach is orthogonal to typical test data augmentation techniques, as recently introduced by szegedy et al. (leading to a runtime reduction of 144 at test time).",4 "fostering user engagement: rhetorical devices for applause generation learnt from ted talks. one problem that every presenter faces when delivering a public discourse is how to hold the listeners' attention and keep them involved. therefore, many studies in conversation analysis work on this issue and suggest qualitatively which constructions can effectively lead to the audience's applause. to investigate these proposals quantitatively, in this study we analyze the transcripts of 2,135 ted talks, with a particular focus on the rhetorical devices used by the presenters for applause elicitation. 
by conducting regression analysis, we identify and interpret 24 rhetorical devices as triggers of audience applauding. we further build models to recognize applause-evoking sentences and conclude this work with its potential implications.",4 "a new approach for translation of isolated units in english-korean machine translation. the most effective way to quickly translate the tremendous amount of explosively increasing science and technique information material is to develop a practicable machine translation system and introduce it into translation practice. this essay treats the problems arising in the translation of isolated units, on the basis of the practical materials and experiments obtained in the development and introduction of an english-korean machine translation system. in other words, this essay considers the establishment of information for isolated units and their korean equivalents and word order.",4 "a multiplicative algorithm on orthogonal groups for independent component analysis. the multiplicative newton-like method developed by the author et al. is extended to the situation where the dynamics are restricted to the orthogonal group. a general framework is constructed without specifying the cost function. though the restriction to the orthogonal groups makes the problem somewhat complicated, an explicit expression for the amount of the individual jumps is obtained. this algorithm is exactly second-order-convergent. the global instability inherent in the newton method is remedied by a levenberg-marquardt-type variation. the method thus constructed can readily be applied to independent component analysis. its remarkable performance is illustrated by a numerical simulation.",4 "resource allocation using metaheuristic search. this research is focused on solving problems in the area of software project management using metaheuristic search algorithms, and lies in the research field of search based software engineering. the main aim of this research is to evaluate the performance of different metaheuristic search techniques on resource allocation and scheduling problems that would be typical of software development projects. this paper reports a set of experiments which evaluate the performance of three algorithms, namely simulated annealing, tabu search and genetic algorithms. 
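a minimal sketch of the simulated-annealing loop for a toy resource-allocation instance (balancing task loads across resources): the cost function, move operator, and cooling schedule here are our own illustrative choices, not the paper's enhanced neighbourhood-generation mechanism.

```python
import numpy as np

def anneal(durations, n_resources, steps=2000, t0=5.0, cooling=0.995, seed=0):
    """assign tasks to resources, minimizing the sum of squared resource
    loads (a balance measure: evenly loaded resources score lowest)."""
    rng = np.random.default_rng(seed)
    assign = rng.integers(n_resources, size=len(durations))

    def cost(a):
        loads = np.bincount(a, weights=durations, minlength=n_resources)
        return float(np.sum(loads ** 2))

    current, t = cost(assign), t0
    for _ in range(steps):
        cand = assign.copy()
        cand[rng.integers(len(durations))] = rng.integers(n_resources)  # move one task
        c = cost(cand)
        # accept improvements always, worse moves with boltzmann probability
        if c < current or rng.random() < np.exp((current - c) / t):
            assign, current = cand, c
        t *= cooling
    return assign, current
```

the acceptance of occasional uphill moves at early, high temperatures is what lets the search escape the local optima that a pure greedy descent would get trapped in.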
the experimental results indicate that metaheuristic search techniques can be used to solve problems of resource allocation and scheduling within a software project. finally, a comparative analysis suggests that overall the genetic algorithm performed better than simulated annealing and tabu search.",4 "neural probabilistic model for non-projective mst parsing. in this paper, we propose a probabilistic parsing model, which defines a proper conditional probability distribution over non-projective dependency trees for a given sentence, using neural representations as inputs. the neural network architecture is based on bi-directional lstm-cnns, which benefits from both word- and character-level representations automatically, by using a combination of bidirectional lstm and cnn. on top of the neural network, we introduce a probabilistic structured layer, defining a conditional log-linear model over non-projective trees. we evaluate our model on 17 different datasets, across 14 different languages. by exploiting kirchhoff's matrix-tree theorem (tutte, 1984), the partition functions and marginals can be computed efficiently, leading to a straightforward end-to-end model training procedure via back-propagation. our parser achieves state-of-the-art parsing performance on nine datasets.",4 "memristor crossbar-based hardware implementation of the ids method. ink drop spread (ids) is the engine of active learning method (alm), a methodology of soft computing. ids, as a pattern-based processing unit, extracts useful information from a system subjected to modeling. in spite of its excellent potential in solving problems such as classification and modeling compared to other soft computing tools, finding a simple and fast hardware implementation is still a challenge. this paper describes a new hardware implementation of the ids method based on the memristor crossbar structure. in addition to simplicity, being completely real-time, having low latency and the ability to continue working after the occurrence of power breakdown are advantages of the proposed circuit.",4 "associative long short-term memory. we investigate a new method to augment recurrent neural networks with extra memory without increasing the number of network parameters.
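The matrix-tree theorem invoked in the parsing abstract above can be sketched in its simplest, undirected form: the number of spanning trees equals any cofactor of the graph Laplacian. This is a toy stand-in for the directed, weighted version used to compute partition functions over dependency trees, not the paper's model.

```python
import numpy as np

def spanning_tree_count(adj):
    """Kirchhoff's matrix-tree theorem: the number of spanning trees of an
    undirected graph equals any cofactor of the Laplacian L = D - A."""
    A = np.asarray(adj, dtype=float)
    L = np.diag(A.sum(axis=1)) - A
    # delete row 0 and column 0 and take the determinant of the remaining minor
    return round(np.linalg.det(L[1:, 1:]))

# complete graph on 4 nodes: Cayley's formula gives 4**(4-2) = 16 spanning trees
K4 = 1 - np.eye(4)
count = spanning_tree_count(K4)
```

In the log-linear parsing setting, the same determinant computed over exponentiated edge scores yields the partition function, which is what makes end-to-end training tractable.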
the system has an associative memory based on complex-valued vectors and is closely related to holographic reduced representations and long short-term memory networks. holographic reduced representations have limited capacity: as they store more information, retrieval becomes noisier due to interference. our system in contrast creates redundant copies of stored information, which enables retrieval with reduced noise. experiments demonstrate faster learning on multiple memorization tasks.",4 "understanding a version of multivariate symmetric uncertainty to assist in feature selection. in this paper, we analyze the behavior of the multivariate symmetric uncertainty (msu) measure through the use of statistical simulation techniques under various mixes of informative and non-informative randomly generated features. experiments show how the number of attributes, their cardinalities, and the sample size affect the msu. we discovered a condition that preserves good quality in the msu under different combinations of these three factors, providing a new useful criterion to help drive the process of dimension reduction.",4 "a combinatorial approach to object analysis. we present a perceptional and mathematical model for image and signal analysis. a resemblance measure is defined, and submitted to an innovating combinatorial optimization algorithm. numerical simulations are also presented",13 "analog simulator of integro-differential equations with classical memristors. an analog computer makes use of continuously changeable quantities of a system, such as its electrical, mechanical, or hydraulic properties, to solve a given problem. while these devices are usually computationally more powerful than their digital counterparts, they suffer from analog noise which does not allow for error control. we focus on analog computers based on active electrical networks comprised of resistors, capacitors, and operational amplifiers which are capable of simulating any linear ordinary differential equation. however, the class of nonlinear dynamics they can solve is limited. in this work, by adding memristors to the electrical network, we show that the analog computer can simulate a large variety of linear and nonlinear integro-differential equations by carefully choosing the conductance and the dynamics of the memristor state variable.
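The holographic-reduced-representation mechanism underlying the associative memory above can be sketched with circular convolution: binding is an element-wise product in the Fourier domain, and unbinding multiplies by the key's conjugate. Using a key whose spectrum has unit modulus (an assumption made here so that retrieval is exact in this toy) illustrates the bind/retrieve cycle; this is not the paper's full complex-valued LSTM.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64

# a key whose Fourier spectrum has unit modulus, so unbinding is exact
phases = rng.uniform(0, 2 * np.pi, n)
key_f = np.exp(1j * phases)

value = rng.standard_normal(n)

# binding: circular convolution = element-wise product in the Fourier domain
trace_f = key_f * np.fft.fft(value)

# unbinding: multiply by the conjugate (inverse) of the key spectrum
retrieved = np.fft.ifft(np.conj(key_f) * trace_f).real
```

When several key/value pairs are summed into one trace, unbinding returns the stored value plus interference noise; averaging over redundant copies is the noise-reduction idea the abstract describes.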
we study the performance of these analog computers by simulating integro-differential models of fluid dynamics type, nonlinear volterra equations for population growth, and quantum models describing non-markovian memory effects, among others. finally, we perform stability tests considering imperfect analog components, obtaining robust solutions with up to $13\%$ relative error for relevant timescales.",4 "whiteout: gaussian adaptive noise regularization in feedforward neural networks. noise injection (ni) is an approach to mitigate over-fitting in feedforward neural networks (nns). the bernoulli ni procedure as implemented in dropout and shakeout has connections with $l_1$ and $l_2$ regularization on the nn model parameters and demonstrates the efficiency and feasibility of ni in regularizing nns. we propose whiteout, a new ni regularization technique with adaptive gaussian noise in nns. whiteout is more versatile than dropout and shakeout. we show that the optimization objective function associated with whiteout in generalized linear models has a closed-form penalty term on the connections which covers a wide range of regularizations and includes bridge, lasso, ridge, and elastic net penalization as special cases; it can also be extended to offer regularization similar to adaptive lasso and group lasso. we prove that whiteout can also be viewed as robust learning of nns in the presence of small perturbations in input and hidden nodes. we establish that the noise-perturbed empirical loss function with whiteout converges almost surely to the ideal loss function, and that the estimates of nn parameters obtained by minimizing the former loss function are consistent with those obtained by minimizing the ideal loss function. computationally, whiteout can be easily incorporated into the back-propagation algorithm. the superiority of whiteout over dropout and shakeout in learning nns with relatively small sized training data is demonstrated using the lsvt voice rehabilitation data and the libras hand movement data.",19 "theoretical insights into the optimization landscape of over-parameterized shallow neural networks. in this paper we study the problem of learning a shallow artificial neural network that best fits a training data set. we study this problem in the over-parameterized regime where the number of observations is fewer than the number of parameters in the model.
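The adaptive Gaussian noise idea in the whiteout abstract above can be sketched as additive noise whose variance depends on each weight's magnitude. The specific variance form and parameter names below are my own illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def whiteout_noise(w, sigma2=0.1, gamma=1.0, lam=0.01, seed=None):
    """Additive Gaussian noise with weight-adaptive variance
    var_i = sigma2 * |w_i|**(-gamma) + lam   (form assumed for illustration).
    Smaller weights receive larger noise, mimicking an adaptive sparsity-
    inducing penalty; gamma and lam interpolate between penalty families."""
    rng = np.random.default_rng(seed)
    var = sigma2 * np.abs(w) ** (-gamma) + lam
    return w + rng.standard_normal(w.shape) * np.sqrt(var)

w = np.array([2.0, 0.5, -1.0, 0.1])
noisy = whiteout_noise(w, seed=0)
```

In training, such noise would be re-sampled at every forward pass, so the expected loss acquires the closed-form penalty the abstract refers to.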
we show that with quadratic activations the optimization landscape of training such shallow neural networks has certain favorable characteristics that allow globally optimal models to be found efficiently using a variety of local search heuristics. this result holds for an arbitrary training data set of input/output pairs. for differentiable activation functions we also show that gradient descent, when suitably initialized, converges at a linear rate to a globally optimal model. this result focuses on a realizable model where the inputs are chosen i.i.d. from a gaussian distribution and the labels are generated according to planted weight coefficients.",4 "a pursuit of temporal accuracy in general activity detection. detecting activities in untrimmed videos is an important but challenging task. the performance of existing methods remains unsatisfactory, e.g., they often meet difficulties in locating the beginning and end of a long complex action. in this paper, we propose a generic framework that can accurately detect a wide variety of activities from untrimmed videos. our first contribution is a novel proposal scheme that can efficiently generate candidates with accurate temporal boundaries. the other contribution is a cascaded classification pipeline that explicitly distinguishes between the relevance and completeness of a candidate instance. on two challenging temporal activity detection datasets, thumos14 and activitynet, the proposed framework significantly outperforms the existing state-of-the-art methods, demonstrating superior accuracy and strong adaptivity in handling activities with various temporal structures.",4 "accurate image super-resolution using very deep convolutional networks. we present a highly accurate single-image super-resolution (sr) method. our method uses a very deep convolutional network inspired by vgg-net used for imagenet classification \cite{simonyan2015very}. we find that increasing the network depth shows a significant improvement in accuracy. our final model uses 20 weight layers. by cascading small filters many times in a deep network structure, contextual information over large image regions is exploited in an efficient way. with very deep networks, however, convergence speed becomes a critical issue during training. we propose a simple yet effective training procedure.
we learn residuals only and use extremely high learning rates ($10^4$ times higher than srcnn \cite{dong2015image}) enabled by adjustable gradient clipping. our proposed method performs better than existing methods in accuracy, and the visual improvements in our results are easily noticeable.",4 "what can we learn from slow self-avoiding adaptive walks by an infinite radius search algorithm?. slow self-avoiding adaptive walks by an infinite radius search algorithm (limax) are analyzed, both in themselves and in network form. the study is conducted on several nk problems and two hiff problems. we find that examination of ""slacker"" walks and networks can indicate relative search difficulty within a family of problems, help identify potential local optima, and detect the presence of structure in fitness landscapes. hierarchical walks can be used to differentiate rugged landscapes which are hierarchical (e.g. hiff) from those which are anarchic (e.g. nk). the notion of node viscidity as a measure of local optimum potential is introduced and found quite successful, although more work needs to be done to improve its accuracy on problems with larger k.",4 "similarity-based estimation of word cooccurrence probabilities. in many applications of natural language processing it is necessary to determine the likelihood of a given word combination. for example, a speech recognizer may need to determine which of the two word combinations ``eat a peach'' and ``eat a beach'' is more likely. statistical nlp methods determine the likelihood of a word combination according to its frequency in a training corpus. however, the nature of language is such that many word combinations are infrequent and do not occur in a given corpus. in this work we propose a method for estimating the probability of such previously unseen word combinations using available information on ``most similar'' words. we describe a probabilistic word association model based on distributional word similarity, and apply it to improving probability estimates for unseen word bigrams in a variant of katz's back-off model. the similarity-based method yields a 20% perplexity improvement in the prediction of unseen bigrams and statistically significant reductions in speech-recognition error.",2 "unbiased data collection with content exploitation/exploration strategy for personalization.
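The adjustable gradient clipping described in the super-resolution abstract above can be sketched directly: clip each gradient entry to [-theta/lr, theta/lr], so the effective update magnitude stays bounded by theta even under very high learning rates. The numeric values below are illustrative only.

```python
import numpy as np

def adjustable_clip(grad, theta, lr):
    """Clip gradients to [-theta/lr, theta/lr] so that the update
    lr * grad is bounded by theta regardless of the learning rate."""
    bound = theta / lr
    return np.clip(grad, -bound, bound)

grad = np.array([10.0, -0.05, 300.0])
clipped = adjustable_clip(grad, theta=1.0, lr=10.0)   # bound = 0.1
```

As the learning rate is decayed during training, the clipping bound loosens automatically, which is the point of making it adjustable rather than fixed.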
one of the missions of personalization systems and recommender systems is to show content items according to users' personal interests. in order to achieve this goal, these systems learn user interests over time and try to present content items tailored to user profiles. recommending items according to users' preferences has been investigated extensively in the past few years, mainly thanks to the popularity of the netflix competition. in a real setting, users may be attracted by a subset of those items and interact with them, leaving only partial feedbacks for the system to learn from in the next cycle, which leads to significant biases in systems and hence results in a situation where user engagement metrics cannot be improved over time. the problem is not limited to one component of the system. data collected from users is usually used in many different tasks, including learning ranking functions, building user profiles, and constructing content classifiers. once the data is biased, all downstream use cases would be impacted as well. therefore, it would be beneficial to gather unbiased data from user interactions. traditionally, unbiased data collection is done by showing items uniformly sampled from the content pool. however, this simple scheme is not feasible as it risks user engagement metrics and takes a long time to gather user feedbacks. in this paper, we introduce a user-friendly unbiased data collection framework, utilizing methods developed in the exploitation and exploration literature. we discuss how the framework is different from normal multi-armed bandit problems and why such a method is needed. we lay out a novel thompson sampling for bernoulli ranked-list to effectively balance user experiences and data collection. the proposed method is validated in a real bucket test and shows strong results comparing to old algorithms",4 "estimation of tissue microstructure using a deep network inspired by a sparse reconstruction framework. diffusion magnetic resonance imaging (dmri) provides a unique tool for noninvasively probing the microstructure of neuronal tissue. the noddi model has become a popular approach to the estimation of tissue microstructure in many neuroscience studies. it represents the diffusion signals with three types of diffusion in tissue: intra-cellular, extra-cellular, and cerebrospinal fluid compartments.
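One round of Thompson sampling for a Bernoulli ranked list, as referenced in the data-collection abstract above, can be sketched as follows; the item names and counts are hypothetical, and this simple top-k-by-sampled-rate scheme is an illustration rather than the paper's exact algorithm.

```python
import random

def thompson_ranked_list(successes, failures, k, rng):
    """One round of Thompson sampling over Bernoulli arms arranged as a
    ranked list: sample a click-rate from each item's Beta posterior
    (with a uniform Beta(1,1) prior) and show the top-k items."""
    sampled = {item: rng.betavariate(successes[item] + 1, failures[item] + 1)
               for item in successes}
    return sorted(sampled, key=sampled.get, reverse=True)[:k]

rng = random.Random(0)
successes = {"a": 50, "b": 5, "c": 1}    # hypothetical click counts
failures  = {"a": 50, "b": 5, "c": 99}   # hypothetical skip counts
shown = thompson_ranked_list(successes, failures, k=2, rng=rng)
```

Because the ranking is drawn from the posterior rather than fixed, every item retains a chance of exposure proportional to its plausibility, which is what keeps the collected feedback (nearly) unbiased without a uniform-random bucket.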
however, the original noddi method uses a computationally expensive procedure to fit the model and could require a large number of diffusion gradients for accurate microstructure estimation, which may be impractical for clinical use. therefore, efforts have been devoted to efficient and accurate noddi microstructure estimation with a reduced number of diffusion gradients. in this work, we propose a deep network based approach to noddi microstructure estimation, named microstructure estimation using a deep network (medn). motivated by the amico algorithm which accelerates the computation of noddi parameters, we formulate the microstructure estimation problem in a dictionary-based framework. the proposed network comprises two cascaded stages. the first stage resembles the solution to a dictionary-based sparse reconstruction problem and the second stage computes the final microstructure using the output of the first stage. the weights in the two stages are jointly learned from training data, which is obtained from training dmri scans with diffusion gradients that densely sample the q-space. the proposed method was applied to brain dmri scans, where two shells each with 30 gradient directions (60 diffusion gradients in total) were used. the estimation accuracy with respect to the gold standard was measured and the results demonstrate that medn outperforms the competing algorithms.",4 "a new perspective on boosting in linear regression via subgradient optimization and relatives. in this paper we analyze boosting algorithms in linear regression from a new perspective: that of modern first-order methods in convex optimization. we show that classic boosting algorithms in linear regression, namely the incremental forward stagewise algorithm (fs$_\varepsilon$) and least squares boosting (ls-boost($\varepsilon$)), can be viewed as subgradient descent to minimize the loss function defined as the maximum absolute correlation between the features and residuals. we also propose a modification of fs$_\varepsilon$ that yields an algorithm for the lasso, which may easily be extended to an algorithm that computes the lasso path for different values of the regularization parameter. furthermore, we show that these new algorithms for the lasso may also be interpreted as the same master algorithm (subgradient descent), applied to a regularized version of the maximum absolute correlation loss function.
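The dictionary-based sparse reconstruction that MEDN's first stage resembles can be sketched with a generic ISTA solver for min_x 0.5||y - Dx||^2 + lam*||x||_1; an unrolled network replaces these hand-set iterations with learned weights. The dictionary and signal below are synthetic stand-ins, not dMRI data.

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def ista(D, y, lam=0.05, steps=200):
    """Iterative shrinkage-thresholding for the lasso problem
    min_x 0.5*||y - D x||^2 + lam*||x||_1."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(D.shape[1])
    for _ in range(steps):
        x = soft_threshold(x + D.T @ (y - D @ x) / L, lam / L)
    return x

rng = np.random.default_rng(0)
D = rng.standard_normal((20, 10))          # synthetic dictionary
x_true = np.zeros(10)
x_true[[2, 7]] = [1.5, -2.0]               # sparse ground-truth code
y = D @ x_true
x_hat = ista(D, y)
```

Each ISTA iteration is a linear map followed by a soft-threshold nonlinearity, which is exactly the shape of one layer in a learned sparse-coding stage.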
we derive novel, comprehensive computational guarantees for several boosting algorithms in linear regression (including ls-boost($\varepsilon$) and fs$_\varepsilon$) using techniques of modern first-order methods in convex optimization. the computational guarantees inform us about the statistical properties of boosting algorithms. in particular they provide, for the first time, a precise theoretical description of the amount of data-fidelity and regularization imparted by running a boosting algorithm with a prespecified learning rate for a fixed but arbitrary number of iterations, for any dataset.",12 "deep reinforcement learning in time series: playing idealized trading games. deep q-learning is investigated as an end-to-end solution to estimate the optimal strategies for acting on time series input. experiments are conducted on two idealized trading games. 1) univariate: the input is a wave-like price time series, 2) bivariate: the input includes a random stepwise price time series and a noisy signal time series, which is positively correlated with future price changes. the univariate game tests whether the agent can capture the underlying dynamics, and the bivariate game tests whether the agent can utilize the hidden relation among the inputs. stacked gated recurrent unit (gru), long short-term memory (lstm) units, convolutional neural network (cnn), and multi-layer perceptron (mlp) are used to model q values. in both games, the agents successfully find a profitable strategy. the gru-based agents show the best overall performance in the univariate game, while the mlp-based agents outperform the others in the bivariate game.",4 "a deep structured model with radius-margin bound for 3d human activity recognition. understanding human activity is very challenging even with recently developed 3d/depth sensors. to solve this problem, this work investigates a novel deep structured model, which adaptively decomposes an activity instance into temporal parts using convolutional neural networks (cnns). our model advances traditional deep learning approaches in two aspects. first, we incorporate latent temporal structure into the deep model, accounting for large temporal variations in diverse human activities.
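The LS-Boost(ε) algorithm analyzed in the boosting abstract above can be sketched in a few lines: at each step, regress the current residual on the single most correlated column and take a shrunken step on that coefficient. The synthetic data here is illustrative.

```python
import numpy as np

def ls_boost(X, y, eps=0.1, steps=300):
    """LS-Boost(eps): repeatedly fit the residual with the single most
    correlated feature and update that coefficient by eps times the
    least-squares step. Equivalent to subgradient descent on the maximum
    absolute correlation loss, per the perspective in the abstract."""
    n, p = X.shape
    beta = np.zeros(p)
    r = y.astype(float).copy()
    for _ in range(steps):
        corr = X.T @ r
        j = np.argmax(np.abs(corr))             # most correlated feature
        step = corr[j] / (X[:, j] @ X[:, j])    # univariate LS coefficient
        beta[j] += eps * step
        r -= eps * step * X[:, j]               # update the residual
    return beta, r

rng = np.random.default_rng(1)
X = rng.standard_normal((50, 5))
y = X @ np.array([1.0, 0.0, -2.0, 0.0, 0.5]) + 0.01 * rng.standard_normal(50)
beta, r = ls_boost(X, y)
```

Smaller eps with more steps traces a more regularized coefficient path, which is the data-fidelity/regularization trade-off the guarantees quantify.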
in particular, we utilize latent variables to decompose the input activity into a number of temporally segmented sub-activities, and accordingly feed them into the parts (i.e. sub-networks) of the deep architecture. second, we incorporate a radius-margin bound as a regularization term into our deep model, which effectively improves the generalization performance for classification. for model training, we propose a principled learning algorithm that iteratively (i) discovers the optimal latent variables (i.e. the ways of activity decomposition) for all training instances, (ii) updates the classifiers based on the generated features, and (iii) updates the parameters of the multi-layer neural networks. in the experiments, our approach is validated on several complex scenarios for human activity recognition and demonstrates superior performances over state-of-the-art approaches.",4 "spectral learning of dynamic systems from nonequilibrium data. observable operator models (ooms) and related models are among the most important and powerful tools for modeling and analyzing stochastic systems. they exactly describe the dynamics of finite-rank systems and can be efficiently and consistently estimated through spectral learning under the assumption of identically distributed data. in this paper, we investigate the properties of spectral learning without this assumption, due to the requirements of analyzing large-time scale systems, and show that the equilibrium dynamics of a system can be extracted from nonequilibrium observation data by imposing an equilibrium constraint. in addition, we propose a binless extension of spectral learning for continuous data. in comparison with other continuous-valued spectral algorithms, the binless algorithm can achieve consistent estimation of equilibrium dynamics with only linear complexity.",4 "information content versus word length in random typing. recently, it has been claimed that a linear relationship between a measure of information content and word length is expected from word length optimization, and it has been shown that this linearity is supported by a strong correlation between information content and word length in many languages (piantadosi et al. 2011, pnas 108, 3825-3826). here, we study in detail the connections between this measure and standard information theory.
the relationship between the measure and word length is studied for the popular random typing process in which a text is constructed by pressing keys at random from a keyboard containing letters and a space behaving as a word delimiter. although this random process does not optimize word lengths according to information content, it exhibits a linear relationship between information content and word length. the exact slope and intercept are presented for three major variants of the random typing process. the strong correlation between information content and word length can simply arise from the units making up a word (e.g., letters) and not necessarily from an interplay between a word and its context as proposed by piantadosi et al. in itself, a linear relation does not entail the results of an optimization process.",15 "spatial modeling of oil exploration areas using neural networks and anfis in gis. exploration of hydrocarbon resources is a highly complicated and expensive process in which various geological, geochemical and geophysical factors are developed and then combined together. it is highly significant how to design the seismic data acquisition survey and locate the exploratory wells, since incorrect or imprecise locations lead to a waste of time and money during the operation. the objective of this study is to locate high-potential oil and gas fields in the 1:250,000 sheet of ahwaz, including 20 oil fields, to reduce the time and costs of exploration and production processes. in this regard, 17 maps were developed using gis functions for factors including: minimum and maximum total organic carbon (toc), yield potential for hydrocarbons production (pp), tmax peak, production index (pi), oxygen index (oi), hydrogen index (hi), as well as presence of or proximity to high residual bouguer gravity anomalies, proximity to anticline axes and faults, and topography and curvature maps obtained from asmari formation subsurface contours. to model and integrate the maps, this study employed artificial neural network and adaptive neuro-fuzzy inference system (anfis) methods. the results obtained from model validation demonstrated that the 17x10x5 neural network with r=0.8948, rms=0.0267, kappa=0.9079 was trained better than the anfis models and predicts the potential areas more accurately. however, the method failed to predict some oil fields and wrongly predicted some areas as potential zones.",19 "neural motifs: scene graph parsing with global context.
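The linearity claimed in the random-typing abstract above is easy to verify analytically: with A letters plus a space acting as delimiter, a specific word of length l has probability (1/(A+1))^(l+1) in the simplest variant, so its information content -log2 p grows linearly in l with slope log2(A+1). A quick check under that assumption:

```python
import math

A = 26                          # letters on the keyboard; space is the delimiter

def info_content(l):
    # a specific word of length l: l letter keystrokes, then one space
    p = (1.0 / (A + 1)) ** (l + 1)
    return -math.log2(p)

# successive differences of information content should all equal log2(A + 1)
slopes = [info_content(l + 1) - info_content(l) for l in range(1, 6)]
```

The slope depends only on the keyboard size, not on any optimization, which is the abstract's point about the letters making up a word.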
we investigate the problem of producing structured graph representations of visual scenes. our work analyzes the role of motifs: regularly appearing substructures in scene graphs. we present new quantitative insights on such repeated structures in the visual genome dataset. our analysis shows that object labels are highly predictive of relation labels but not vice-versa. we also find that there are recurring patterns even in larger subgraphs: more than 50% of graphs contain motifs involving at least two relations. this analysis leads to a new baseline that is simple, yet strikingly powerful. while hardly considering the overall visual context of an image, it outperforms previous approaches. we then introduce stacked motif networks, a new architecture for encoding global context, which is crucial for capturing higher order motifs in scene graphs. our best model for scene graph detection achieves a 7.3% absolute improvement in recall@50 (41% relative gain) over prior state-of-the-art.",4 "active detection and localization of textureless objects in cluttered environments. this paper introduces an active object detection and localization framework that combines a robust untextured object detection and 3d pose estimation algorithm with a novel next-best-view selection strategy. we address the detection and localization problems by proposing an edge-based registration algorithm that refines the object position by minimizing a cost directly extracted from a 3d image tensor that encodes the minimum distance to an edge point in a joint direction/location space. we face the next-best-view problem by exploiting a sequential decision process that, at each step, selects the next camera position which maximizes the mutual information between the state and the next observations. we solve the intrinsic intractability of this solution by generating observations that represent scene realizations, i.e. combinations of samples of the object hypothesis provided by the object detector, while modeling the state by means of a set of constantly resampled particles. experiments performed on different real world, challenging datasets confirm the effectiveness of the proposed methods.",4 "a possibilistic model for qualitative sequential decision problems under uncertainty in partially observable environments.
in this article we propose a qualitative (ordinal) counterpart of the partially observable markov decision processes model (pomdp), in which the uncertainty, as well as the preferences of the agent, are modeled by possibility distributions. this qualitative counterpart of the pomdp model relies on a possibilistic theory of decision under uncertainty, recently developed. one advantage of such a qualitative framework is its ability to escape the classical obstacle of stochastic pomdps, in which, even with a finite state space, the obtained belief state space of the pomdp is infinite. instead, in the possibilistic framework, even with an exponentially larger state space, the belief state space remains finite.",4 "notes on electronic lexicography. these notes are a continuation of the topics covered in v. selegej's article ""electronic dictionaries and computational lexicography"". how can an electronic dictionary have closely related languages as its object of description? obviously, this question allows multiple answers.",4 "bidirectional long-short term memory for video description. video captioning has been attracting broad research attention in the multimedia community. however, most existing approaches either ignore temporal information among video frames or just employ local contextual temporal knowledge. in this work, we propose a novel video captioning framework, termed \emph{bidirectional long-short term memory} (bilstm), which deeply captures bidirectional global temporal structure in video. specifically, we first devise a joint visual modelling approach to encode video data by combining a forward lstm pass, a backward lstm pass, together with visual features from convolutional neural networks (cnns). then, we inject the derived video representation into the subsequent language model for initialization. the benefits are two-fold: 1) comprehensively preserving sequential and visual information; and 2) adaptively learning dense visual features and sparse semantic representations for videos and sentences, respectively. we verify the effectiveness of our proposed video captioning framework on a commonly-used benchmark, i.e., the microsoft video description (msvd) corpus, and the experimental results demonstrate the superiority of the proposed approach compared to several state-of-the-art methods.",4 "analysis and visualisation of rdf resources in ondex.
ondex is a data integration and visualization platform developed to support systems biology research. its core data model is based on two main principles: first, information is represented as a graph and, second, elements of the graph are annotated with ontologies. this data model is conformant with the semantic web framework, in particular rdf, and therefore ondex is ideally positioned as a platform to exploit the semantic web.",4 "variational depth from focus reconstruction. this paper deals with the problem of reconstructing a depth map from a sequence of differently focused images, also known as depth from focus or shape from focus. we propose to state the depth from focus problem as a variational problem including a smooth but nonconvex data fidelity term and a convex nonsmooth regularization, which makes the method robust to noise and leads to more realistic depth maps. additionally, we propose to solve the nonconvex minimization problem with a linearized alternating directions method of multipliers (admm), allowing to minimize the energy very efficiently. a numerical comparison to classical methods on simulated as well as on real data is presented.",4 "using robdds for inference in bayesian networks with troubleshooting as an example. when using bayesian networks for modelling the behavior of man-made machinery, it usually happens that a large part of the model is deterministic. for such bayesian networks the deterministic part of the model can be represented as a boolean function, and a central part of belief updating reduces to the task of calculating the number of satisfying configurations of a boolean function. in this paper we explore how advances in the calculation of boolean functions can be adopted for belief updating, in particular within the context of troubleshooting. we present experimental results indicating a substantial speed-up compared to traditional junction tree propagation.",4 "beyond pixels and regions: a non local patch means (nlpm) method for content-level restoration, enhancement, and reconstruction of degraded document images. a patch-based non-local restoration and reconstruction method for preprocessing degraded document images is introduced. the method collects relative data from the whole input image, where the image data is first represented by a content-level descriptor based on patches.
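The quantity at the heart of the ROBDD abstract above, the number of satisfying configurations of a boolean function, can be shown with a brute-force counter on a toy constraint. An ROBDD computes this count without enumeration; exhaustive enumeration is a stand-in here, and the exactly-one constraint is my own example.

```python
from itertools import product

def count_satisfying(f, n):
    """Count assignments of n boolean variables that satisfy f.
    An ROBDD would compute this count in time linear in its size;
    brute-force enumeration over 2**n assignments is a toy stand-in."""
    return sum(1 for bits in product([False, True], repeat=n) if f(*bits))

# example deterministic constraint: exactly one of three components has failed
exactly_one = lambda a, b, c: (a + b + c) == 1
count = count_satisfying(exactly_one, 3)
```

In the troubleshooting setting, weighting each satisfying assignment by the priors of its variable values turns this count into the posterior normalization constant.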
this patch-equivalent representation of the input image is then corrected based on similar patches identified using a modified genetic algorithm (ga), resulting in a low computational load. the corrected patch-equivalent is then converted to the output restored image. the fact that the method uses patches at the content level allows it to incorporate high-level restoration objectives in a self-sufficient way. the method was applied to several degraded document images, including the dibco'09 contest dataset, with promising results.",4 "sdna: stochastic dual newton ascent for empirical risk minimization. we propose a new algorithm for minimizing regularized empirical loss: stochastic dual newton ascent (sdna). our method is dual in nature: in each iteration we update a random subset of the dual variables. however, unlike existing methods such as stochastic dual coordinate ascent, sdna is capable of utilizing all curvature information contained in the examples, which leads to striking improvements in both theory and practice - sometimes by orders of magnitude. in the special case when an l2-regularizer is used in the primal, the dual problem is a concave quadratic maximization problem plus a separable term. in this regime, sdna in each step solves a proximal subproblem involving a random principal submatrix of the hessian of the quadratic function; whence the name of the method. if, in addition, the loss functions are quadratic, our method can be interpreted as a novel variant of the recently introduced iterative hessian sketch.",4 "stochastic reformulations of linear systems: algorithms and convergence theory. we develop a family of reformulations of an arbitrary consistent linear system into a stochastic problem. the reformulations are governed by two user-defined parameters: a positive definite matrix defining a norm, and an arbitrary discrete or continuous distribution over random matrices. our reformulation has several equivalent interpretations, allowing researchers from various communities to leverage their domain specific insights. in particular, our reformulation can be equivalently seen as a stochastic optimization problem, a stochastic linear system, a stochastic fixed point problem and a probabilistic intersection problem. we prove sufficient, and necessary and sufficient, conditions for the reformulation to be exact.
further, we propose and analyze three stochastic algorithms for solving the reformulated problem---basic, parallel and accelerated methods---with global linear convergence rates. the rates can be interpreted as condition numbers of a matrix which depends on the system matrix and on the reformulation parameters. this gives rise to a new phenomenon which we call stochastic preconditioning, which refers to the problem of finding parameters (matrix and distribution) leading to a sufficiently small condition number. our basic method can be equivalently interpreted as stochastic gradient descent, stochastic newton method, stochastic proximal point method, stochastic fixed point method, or stochastic projection method, all with a fixed stepsize (relaxation parameter), applied to the reformulations.",12 "on the benefits of output sparsity for multi-label classification. the multi-label classification framework, where each observation can be associated with a set of labels, has generated a tremendous amount of attention over recent years. modern multi-label problems are typically large-scale in terms of the number of observations, features and labels, and the amount of labels can even be comparable to the amount of observations. in this context, different remedies have been proposed to overcome the curse of dimensionality. in this work, we aim at exploiting output sparsity by introducing a new loss, called the sparse weighted hamming loss. this proposed loss can be seen as a weighted version of classical ones, where active and inactive labels are weighted separately. leveraging the influence of sparsity in the loss function, we provide improved generalization bounds for the empirical risk minimizer, a suitable property for large-scale problems. for this new loss, we derive rates of convergence linear in the underlying output-sparsity rather than linear in the number of labels. in practice, minimizing the associated risk can be performed efficiently by using convex surrogates and modern convex optimization algorithms. we provide experiments on various real-world datasets demonstrating the pertinence of our approach when compared to non-weighted techniques.",12 "scale-invariance of ruggedness measures in fractal fitness landscapes. this paper deals with using chaos to direct trajectories to targets and analyzes the ruggedness and fractality of the resulting fitness landscapes.
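A concrete instance of the basic method in the stochastic-reformulation abstract above is randomized Kaczmarz: with single-row sketches, the stochastic projection step projects the iterate onto one randomly chosen row's hyperplane. The sketch below uses row-norm-squared sampling on a synthetic consistent system; it illustrates the special case, not the full family of reformulations.

```python
import numpy as np

def randomized_kaczmarz(A, b, iters=2000, seed=0):
    """Randomized Kaczmarz for a consistent system Ax = b: at each step,
    project the iterate onto the hyperplane of a row sampled with
    probability proportional to its squared norm. A special case of the
    sketch-and-project / stochastic reformulation viewpoint."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    probs = np.linalg.norm(A, axis=1) ** 2
    probs /= probs.sum()
    x = np.zeros(n)
    for _ in range(iters):
        i = rng.choice(m, p=probs)
        a = A[i]
        x += (b[i] - a @ x) / (a @ a) * a   # projection onto {z : a.z = b_i}
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((30, 5))
x_true = rng.standard_normal(5)
b = A @ x_true
x_hat = randomized_kaczmarz(A, b)
```

Changing the sampling distribution or sketch size changes the effective condition number, which is the stochastic preconditioning phenomenon the abstract names.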
the targeting problem is formulated as a dynamic fitness landscape and four different chaotic maps generating such landscapes are studied. using a computational approach, we analyze the properties of the landscapes and quantify their fractal and rugged characteristics. in particular, it is shown that the ruggedness measures correlation length and information content are scale-invariant and self-similar.",13 "npglm: a non-parametric method for temporal link prediction. in this paper, we try to solve the problem of temporal link prediction in information networks. this implies predicting the time it takes for a link to appear in the future, given its features extracted from the current network snapshot. to this end, we introduce a probabilistic non-parametric approach, called ""non-parametric generalized linear model"" (np-glm), which infers the hidden underlying probability distribution of the link advent time given its features. we then present a learning algorithm for np-glm and an inference method to answer time-related queries. extensive experiments conducted on both synthetic data and the real-world sina weibo social network demonstrate the effectiveness of np-glm in solving the temporal link prediction problem vis-a-vis competitive baselines.",4 "publishing and linking transport data on the web. without linked data, transport data is limited to applications exclusively around transport. in this paper, we present a workflow for publishing and linking transport data on the web. with it we are able to develop transport applications and to add features which are created from other datasets. this is possible because the transport data is linked to those datasets. we apply this workflow to two datasets: neptune, a french standard describing a transport line, and passim, a directory containing relevant information on transport services, for every french city.",4 "inducing interpretability in knowledge graph embeddings. we study the problem of inducing interpretability in kg embeddings. specifically, we explore the universal schema (riedel et al., 2013) and propose a method to induce interpretability. while many vector space models have been proposed for this problem, these methods do not address the interpretability (semantics) of individual dimensions. in this work, we study this problem and propose a method for inducing interpretability in kg embeddings using entity co-occurrence statistics.
the proposed method significantly improves interpretability, while maintaining comparable performance on other kg tasks.",4 "stochastic metamorphosis with template uncertainties. in this paper, we investigate two stochastic perturbations of the metamorphosis equations of image analysis, in the geometrical context of the euler-poincar\'e theory. in the metamorphosis of images, the lie group of diffeomorphisms deforms a template image that is undergoing its own internal dynamics as it deforms. this type of deformation allows more freedom for image matching and has analogies with complex fluids when the template properties are regarded as order parameters (coset spaces of broken symmetries). the first stochastic perturbation we consider corresponds to uncertainty due to random errors in the reconstruction of the deformation map from its vector field. we also consider a second stochastic perturbation, which compounds the uncertainty in the deformation map with uncertainty in the reconstruction of the template position from its velocity field. we apply this general geometric theory to several classical examples, including landmarks, images, and closed curves, and we discuss its use in functional data analysis.",4 "the homotopy parametric simplex method for sparse learning. high dimensional sparse learning has imposed a great computational challenge on large scale data analysis. in this paper, we are interested in a broad class of sparse learning approaches formulated as linear programs parametrized by a {\em regularization factor}, and we solve them by the parametric simplex method (psm). our parametric simplex method offers significant advantages over other competing methods: (1) psm naturally obtains the complete solution path for all values of the regularization parameter; (2) psm provides a high precision dual certificate stopping criterion; (3) psm yields sparse solutions through very few iterations, and the solution sparsity significantly reduces the computational cost per iteration. in particular, we demonstrate the superiority of psm over various sparse learning approaches, including the dantzig selector for sparse linear regression, lad-lasso for sparse robust linear regression, clime for sparse precision matrix estimation, sparse differential network estimation, and sparse linear programming discriminant (lpd) analysis.
provide sufficient conditions psm always outputs sparse solutions computational performance significantly boosted. thorough numerical experiments provided demonstrate outstanding performance psm method.",4 "neuro-mathematical model geometrical optical illusions. geometrical optical illusions object many studies due possibility offer understand behaviour low-level visual processing. consist situations perceived geometrical properties object differ object visual stimulus. starting geometrical model introduced citti sarti [3], provide mathematical model computational algorithm allows interpret phenomena qualitatively reproduce perceived misperception.",4 "logic-based approach generatively defined discriminative modeling. conditional random fields (crfs) usually specified graphical models paper propose use probabilistic logic programs specify generatively. intention first provide unified approach crfs complex modeling use turing complete language second offer convenient way realizing generative-discriminative pairs machine learning compare generative discriminative models choose best model. implemented approach d-prism language modifying prism, logic-based probabilistic modeling language generative modeling, exploiting dynamic programming mechanism efficient probability computation. tested d-prism logistic regression, linear-chain crf crf-cfg empirically confirmed excellent discriminative performance compared generative counterparts, i.e.\ naive bayes, hmm pcfg. also introduced new crf models, crf-bncs crf-lcgs. crf versions bayesian network classifiers probabilistic left-corner grammars respectively easily implementable d-prism. empirically showed outperform generative counterparts expected.",4 "mdps unawareness. markov decision processes (mdps) widely used modeling decision-making problems robotics, automated control, economics. traditional mdps assume decision maker (dm) knows states actions. however, may true many situations interest.
define new framework, mdps unawareness (mdpus) deal possibilities dm may aware possible actions. provide complete characterization dm learn play near-optimally mdpu, give algorithm learns play near-optimally possible so, efficiently possible. particular, characterize near-optimal solution found polynomial time.",4 "weighted unsupervised learning 3d object detection. paper introduces novel weighted unsupervised learning object detection using rgb-d camera. technique feasible detecting moving objects noisy environments captured rgb-d camera. main contribution paper real-time algorithm detecting object using weighted clustering separate cluster. preprocessing step, algorithm calculates pose 3d position x, y, z rgb color data point calculates data point's normal vector using point's neighbor. preprocessing, algorithm calculates k-weights data point; weight indicates membership. resulting clustered objects scene.",4 "prediction using note text: synthetic feature creation word2vec. word2vec affords simple yet powerful approach extracting quantitative variables unstructured textual data. half healthcare data unstructured therefore hard model without involved expertise data engineering natural language processing. word2vec serve bridge quickly gather intelligence data sources. study, ran 650 megabytes unstructured, medical chart notes providence health & services electronic medical record word2vec. used two different approaches creating predictive variables tested risk readmission patients copd (chronic obstructive lung disease). comparative benchmark, ran test using lace risk model (a single score based length stay, acuity, comorbid conditions, emergency department visits). using free text mathematical might, found word2vec comparable lace predicting risk readmission copd patients.",4 "polynomial value iteration algorithms deterministic mdps. value iteration commonly used empirically competitive method solving many markov decision process problems.
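An editorial aside: the basic value-iteration backup referenced in the abstract above can be sketched in a few lines. This is the generic discounted finite-MDP version, not the paper's average-reward DMDP analysis, and the two-state MDP used to exercise it below is an invented example.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10000):
    """Basic value iteration for a finite MDP.

    P: (A, S, S) transition probabilities, R: (A, S) expected rewards.
    Returns the optimal value function and a greedy policy.
    """
    A, S, _ = P.shape
    V = np.zeros(S)
    for _ in range(max_iter):
        # Bellman optimality backup: Q[a, s] = R[a, s] + gamma * sum_s' P[a, s, s'] V[s']
        Q = R + gamma * (P @ V)
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            V = V_new
            break
        V = V_new
    return V, Q.argmax(axis=0)
```

With gamma = 0.9 the backup is a contraction, so the sup-norm stopping rule bounds the distance to the fixed point by roughly tol * gamma / (1 - gamma).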
however, known value iteration pseudo-polynomial complexity general. establish somewhat surprising polynomial bound value iteration deterministic markov decision (dmdp) problems. show basic value iteration procedure converges highest average reward cycle dmdp problem theta(n^2) iterations, theta(mn^2) total time, n denotes number states, m number edges. give two extensions value iteration solve dmdp theta(mn) time. explore analysis policy iteration algorithms report empirical study value iteration showing convergence much faster random sparse graphs.",4 "empirically analyzing effect dataset biases deep face recognition systems. unknown kind biases modern wild face datasets lack annotation. direct consequence total recognition rates alone provide limited insight generalization ability deep convolutional neural networks (dcnns). propose empirically study effect different types dataset biases generalization ability dcnns. using synthetically generated face images, study face recognition rate function interpretable parameters face pose light. proposed method allows valuable details generalization performance different dcnn architectures observed compared. experiments, find that: 1) indeed, dataset bias significant influence generalization performance dcnns. 2) dcnns generalize surprisingly well unseen illumination conditions large sampling gaps pose variation. 3) uncover main limitation current dcnn architectures, difficulty generalize different identities share pose variation. 4) demonstrate findings synthetic data also apply learning real world data. face image generator publicly available enable community benchmark face recognition systems common ground.",4 "iterative algorithm fitting nonconvex penalized generalized linear models grouped predictors. high-dimensional data pose challenges statistical learning modeling. sometimes predictors naturally grouped pursuing between-group sparsity desired.
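The between-group sparsity idea above has a compact convex illustration: the proximal operator of the group-lasso penalty shrinks each group and zeroes out whole groups at once. This is a sketch of that standard operator only, not the paper's nonconvex penalties; the groups and lam value below are invented.

```python
import numpy as np

def prox_group_lasso(beta, groups, lam):
    """Proximal operator of the group-lasso penalty lam * sum_g ||beta_g||_2.

    Shrinks each predictor group toward zero and zeroes out whole groups
    whose norm falls below lam -- the between-group sparsity effect.
    """
    out = beta.copy()
    for g in groups:                       # g is an index array for one group
        norm = np.linalg.norm(beta[g])
        scale = max(0.0, 1.0 - lam / norm) if norm > 0 else 0.0
        out[g] = scale * beta[g]           # blockwise soft-thresholding
    return out
```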
collinearity may occur real-world high-dimensional applications popular $l_1$ technique suffers selection inconsistency prediction inaccuracy. moreover, problems interest often go beyond gaussian models. meet challenges, nonconvex penalized generalized linear models grouped predictors investigated simple-to-implement algorithm proposed computation. rigorous theoretical result guarantees convergence provides tight preliminary scaling. framework allows grouped predictors nonconvex penalties, including discrete $l_0$ `$l_0+l_2$' type penalties. penalty design parameter tuning nonconvex penalties examined. applications super-resolution spectrum estimation signal processing cancer classification joint gene selection bioinformatics show performance improvement nonconvex penalized estimation.",19 "classification ensembles neural networks. introduce new procedure training artificial neural networks using approximation objective function arithmetic mean ensemble selected randomly generated neural networks, apply procedure classification (or pattern recognition) problem. approach differs standard one based optimization theory. particular, neural network mentioned ensemble may approximation objective function.",4 "inferring location authors words texts. purposes computational dialectology geographically bound text analysis tasks, texts must annotated authors' location. many texts locatable explicit labels explicit annotation place. paper describes series experiments determine positionally annotated microblog posts used learn location-indicating words used locate blog texts authors. gaussian distribution used model locational qualities words. introduce notion placeness describe locational words are. find modelling word distributions account several locations thus several gaussian distributions per word, defining filter picks words high placeness based local distributional context, aggregating locational information centroid text gives useful results. 
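The aggregation step just described (per-word Gaussians over locations combined into a text-level centroid) can be sketched with inverse-variance weighting. The word models, coordinates, and isotropic-variance assumption below are invented for illustration, and the paper's placeness filter is omitted.

```python
import numpy as np

def text_centroid(words, word_models):
    """Aggregate per-word location estimates into one text-level centroid.

    word_models maps a word to (mean, variance) of a 2-D isotropic Gaussian
    over (lat, lon); low-variance (high 'placeness') words dominate via
    inverse-variance weighting. Unknown words are skipped.
    """
    means, weights = [], []
    for w in words:
        if w in word_models:
            mu, var = word_models[w]
            means.append(mu)
            weights.append(1.0 / var)
    if not means:
        return None
    means = np.asarray(means, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return (weights[:, None] * means).sum(axis=0) / weights.sum()
```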
results applied data swedish language.",4 "neutrality many-valued logics. book, consider various many-valued logics: standard, linear, hyperbolic, parabolic, non-archimedean, p-adic, interval, neutrosophic, etc. survey also results show three different proof-theoretic frameworks many-valued logics, e.g. frameworks following deductive calculi: hilbert's style, sequent, hypersequent. present general way allows construct systematically analytic calculi large family non-archimedean many-valued logics: hyperrational-valued, hyperreal-valued, p-adic valued logics characterized special format semantics appropriate rejection archimedes' axiom. logics built different extensions standard many-valued logics (namely, lukasiewicz's, goedel's, product, post's logics). informal sense archimedes' axiom anything measured ruler. also logical multiple-validity without archimedes' axiom consists set truth values infinite well-founded well-ordered. base non-archimedean valued logics, construct non-archimedean valued interval neutrosophic logic inl describe neutrality phenomena.",4 "regression trees random forest based feature selection malaria risk exposure prediction. paper deals prediction anopheles number, main vector malaria risk, using environmental climate variables. variables selection based automatic machine learning method using regression trees, random forests combined stratified two levels cross validation. minimum threshold variables importance assessed using quadratic distance variables importance optimal subset selected variables used perform predictions. finally results revealed qualitatively better, selection, prediction, cpu time point view obtained glm-lasso method.",19 "mml consistent neyman-scott. strict minimum message length (smml) statistical inference method widely cited (but informal arguments) providing estimations consistent general estimation problems.
is, however, almost invariably intractable compute, reason approximations (known mml algorithms) ever used practice. investigate neyman-scott estimation problem, oft-cited showcase consistency mml, show even natural choice prior, neither smml popular approximations consistent it, thereby providing counterexample general claim. first known explicit construction smml solution natural, high-dimensional problem. use novel construction methods refute claims regarding mml also appearing literature.",19 "planning based framework essay generation. generating article automatically computer program challenging task artificial intelligence natural language processing. paper, target essay generation, takes input topic word mind generates organized article theme topic. follow idea text planning \cite{reiter1997} develop essay generation framework. framework consists three components, including topic understanding, sentence extraction sentence reordering. component, studied several statistical algorithms empirically compared terms qualitative quantitative analysis. although run experiments chinese corpus, method language independent easily adapted language. lay remaining challenges suggest avenues future research.",4 "deep ordinal ranking multi-category diagnosis alzheimer's disease using hippocampal mri data. increasing effort brain image analysis dedicated early diagnosis alzheimer's disease (ad) based neuroimaging data. existing studies focusing binary classification problems, e.g., distinguishing ad patients normal control (nc) elderly mild cognitive impairment (mci) individuals nc elderly. however, identifying individuals ad mci, especially mci individuals convert ad (progressive mci, pmci), single setting, needed achieve goal early diagnosis ad. 
paper, propose deep ordinal ranking model distinguishing nc, stable mci (smci), pmci, ad individual subject level, taking account inherent ordinal severity brain degeneration caused normal aging, mci, ad, rather formulating classification multi-category classification problem. proposed deep ordinal ranking model focuses hippocampal morphology individuals learns informative discriminative features automatically. experiment results based large cohort individuals alzheimer's disease neuroimaging initiative (adni) indicate proposed method achieve better performance traditional multi-category classification techniques using shape radiomics features structural magnetic resonance imaging (mri) data.",4 "visualizing understanding neural models nlp. neural networks successfully applied many nlp tasks resulting vector-based models difficult interpret. example clear achieve {\em compositionality}, building sentence meaning meanings words phrases. paper describe four strategies visualizing compositionality neural models nlp, inspired similar work computer vision. first plot unit values visualize compositionality negation, intensification, concessive clauses, allow us see well-known markedness asymmetries negation. introduce three simple straightforward methods visualizing unit's {\em salience}, amount contributes final composed meaning: (1) gradient back-propagation, (2) variance token average word node, (3) lstm-style gates measure information flow. test methods sentiment using simple recurrent nets lstms. general-purpose methods may wide applications understanding compositionality semantic properties deep networks , also shed light lstms outperform simple recurrent nets,",4 "segmentation free object discovery video. paper present simple yet effective approach extend without supervision object proposal static images videos. 
unlike previous methods, spatio-temporal proposals, refer tracks, generated relying little visual content exploiting bounding boxes spatial correlations time. tracks obtain likely represent objects general-purpose tool represent meaningful video content wide variety tasks. unannotated videos, tracks used discover content without supervision. contribution also propose novel dataset-independent method evaluate generic object proposal based entropy classifier output response. experiment two competitive datasets, namely youtube objects ilsvrc-2015 vid.",4 "labelfusion: pipeline generating ground truth labels real rgbd data cluttered scenes. deep neural network (dnn) architectures shown outperform traditional pipelines object segmentation pose estimation using rgbd data, performance dnn pipelines directly tied representative training data true data. hence key requirement employing methods practice large set labeled data specific robotic manipulation task, requirement generally satisfied existing datasets. paper develop pipeline rapidly generate high quality rgbd data pixelwise labels object poses. use rgbd camera collect video scene multiple viewpoints leverage existing reconstruction techniques produce 3d dense reconstruction. label 3d reconstruction using human assisted icp-fitting object meshes. reprojecting results labeling 3d scene produce labels rgbd image scene. pipeline enabled us collect 1,000,000 labeled object instances days. use dataset answer questions related much training data required, quality data must be, achieve high performance dnn architecture.",4 "automated assignment backbone nmr data using artificial intelligence. nuclear magnetic resonance (nmr) spectroscopy powerful method investigation three-dimensional structures biological molecules proteins. determining protein structure essential understanding function alterations function lead disease.
one major challenges post-genomic era obtain structural functional information many unknown proteins encoded thousands newly identified genes. goal research design algorithm capable automating analysis backbone protein nmr data implementing ai strategies greedy a* search.",4 "model virtual carrier immigration digital images region segmentation. novel model image segmentation proposed, inspired carrier immigration mechanism physical p-n junction. carrier diffusing drifting simulated proposed model, imitates physical self-balancing mechanism p-n junction. effect virtual carrier immigration digital images analyzed studied experiments test images real world images. sign distribution net carrier model's balance state exploited region segmentation. experimental results test images real-world images demonstrate self-adaptive meaningful gathering pixels suitable regions, prove effectiveness proposed method image region segmentation.",4 "evolutionary model turing machines. development large non-coding fraction eukaryotic dna phenomenon code-bloat field evolutionary computations show striking similarity. seems suggest (in presence mechanisms code growth) evolution complex code can't attained without maintaining large inactive fraction. test hypothesis performed computer simulations evolutionary toy model turing machines, studying relations among fitness coding/non-coding ratio varying mutation code growth rates. results suggest that, model, large reservoir non-coding states constitutes great (long term) evolutionary advantage.",16 "smash: physics-guided reconstruction collisions videos. collision sequences commonly used games entertainment add drama excitement. authoring even two body collisions real world difficult, one get timing object trajectories correctly synchronized. tedious trial-and-error iterations, objects actually made collide, difficult capture 3d.
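As background for the collision-parameter discussion above: for two bodies in one dimension, momentum conservation plus the restitution relation v1' - v2' = -e (v1 - v2) determines the post-collision velocities from the masses and the coefficient of restitution e. This is textbook rigid-body physics, not the SMASH method itself.

```python
def collide_1d(m1, v1, m2, v2, e):
    """Post-collision velocities of two bodies in 1-D.

    e is the coefficient of restitution (1 = elastic, 0 = perfectly plastic).
    """
    p = m1 * v1 + m2 * v2                     # total momentum, conserved
    v1p = (p + m2 * e * (v2 - v1)) / (m1 + m2)
    v2p = (p + m1 * e * (v1 - v2)) / (m1 + m2)
    return v1p, v2p
```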
contrast, synthetically generating plausible collisions difficult requires adjusting different collision parameters (e.g., object mass ratio, coefficient restitution, etc.) appropriate initial parameters. present smash read appropriate collision parameters directly raw input video recordings. technically enable utilizing laws rigid body collision regularize problem lifting 2d trajectories physically valid 3d reconstruction collision. reconstructed sequences modified combined easily author novel plausible collisions. evaluate system range synthetic scenes demonstrate effectiveness method accurately reconstructing several complex real world collision events.",4 "image authentication based neural networks. neural network attracting researchers since past decades. properties, parameter sensitivity, random similarity, learning ability, etc., make suitable information protection, data encryption, data authentication, intrusion detection, etc. paper, investigating neural networks' properties, low-cost authentication method based neural networks proposed used authenticate images videos. authentication method detect whether images videos modified maliciously. firstly, chapter introduces neural networks' properties, parameter sensitivity, random similarity, diffusion property, confusion property, one-way property, etc. secondly, chapter gives introduction neural network based protection methods. thirdly, image video authentication scheme based neural networks presented, performances, including security, robustness efficiency, analyzed. finally, conclusions drawn, open issues field presented.",4 "want answers? reddit inspired study pose questions. questions form integral part everyday communication, offline online. getting responses questions others fundamental satisfying information need extending knowledge boundaries. question may represented using various factors social, syntactic, semantic, etc.
hypothesize factors contribute varying degrees towards getting responses others given question. perform thorough empirical study measure effects factors using novel question answer dataset website reddit.com. best knowledge, first analysis kind important topic. also use sparse nonnegative matrix factorization technique automatically induce interpretable semantic factors question dataset. also document various patterns response prediction observe analysis data. instance, found preference-probing questions scantily answered. method robust capture latent response factors. hope make code datasets publicly available upon publication paper.",4 "experimental comparison several clustering initialization methods. examine methods clustering high dimensions. first part paper, perform experimental comparison three batch clustering algorithms: expectation-maximization (em) algorithm, winner take all version em algorithm reminiscent k-means algorithm, model-based hierarchical agglomerative clustering. learn naive-bayes models hidden root node, using high-dimensional discrete-variable data sets (both real synthetic). find em algorithm significantly outperforms methods, proceed investigate effect various initialization schemes final solution produced em algorithm. initializations consider (1) parameters sampled uninformative prior, (2) random perturbations marginal distribution data, (3) output hierarchical agglomerative clustering. although methods substantially different, lead learned models strikingly similar quality.",4 "smoothed low rank sparse matrix recovery iteratively reweighted least squares minimization. work presents general framework solving low rank and/or sparse matrix minimization problems, may involve multiple non-smooth terms. iteratively reweighted least squares (irls) method fast solver, smooths objective function minimizes alternately updating variables weights. however, traditional irls solve sparse low rank minimization problem squared loss affine constraint.
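A minimal IRLS sketch for the classical l1-regularized least-squares case helps fix ideas: each |x_i| is smoothed as x_i^2 / (2 w_i) with w_i = |x_i| + eps, so every step becomes a weighted ridge solve alternated with a weight update. This is the traditional setting the paper generalizes; lam, eps, and the identity-matrix test problem are arbitrary illustrative choices.

```python
import numpy as np

def irls_lasso(A, b, lam=0.2, eps=1e-6, n_iter=100):
    """IRLS sketch for min_x 0.5 * ||A x - b||^2 + lam * ||x||_1."""
    x = np.linalg.lstsq(A, b, rcond=None)[0]   # least-squares warm start
    AtA, Atb = A.T @ A, A.T @ b
    for _ in range(n_iter):
        w = np.abs(x) + eps                    # smoothing weights for |x_i|
        # weighted ridge solve: (A^T A + lam * diag(1/w)) x = A^T b
        x = np.linalg.solve(AtA + lam * np.diag(1.0 / w), Atb)
    return x
```

For a diagonal system the exact l1 solution is coordinate-wise soft-thresholding, which the smoothed iteration approaches as eps shrinks.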
work generalizes irls solve joint/mixed low rank sparse minimization problems, essential formulations many tasks. concrete example, solve schatten-$p$ norm $\ell_{2,q}$-norm regularized low-rank representation (lrr) problem irls, theoretically prove derived solution stationary point (globally optimal $p,q\geq1$). convergence proof irls general previous one depends special properties schatten-$p$ norm $\ell_{2,q}$-norm. extensive experiments synthetic real data sets demonstrate irls much efficient.",4 "detecting adversarial samples using density ratio estimates. machine learning models, especially based deep architectures used everyday applications ranging self driving cars medical diagnostics. shown models dangerously susceptible adversarial samples, indistinguishable real samples human eye, adversarial samples lead incorrect classifications high confidence. impact adversarial samples far-reaching efficient detection remains open problem. propose use direct density ratio estimation efficient model agnostic measure detect adversarial samples. proposed method works equally well single multi-channel samples, different adversarial sample generation methods. also propose method use density ratio estimates generating adversarial samples added constraint preserving density ratio.",4 "joint framework argumentative text analysis incorporating domain knowledge. argumentation mining, several sub-tasks argumentation component type classification, relation classification. existing research tends solve sub-tasks separately, ignore close relation them. paper, present joint framework incorporating logical relation sub-tasks improve performance argumentation structure generation. design objective function combine predictions individual models sub-task solve problem constraints constructed background knowledge. evaluate proposed model two public corpora experiment results show model outperform baseline uses separate model significantly sub-task. 
model also shows advantages component-related sub-tasks compared state-of-the-art joint model based evidence graph.",4 "analyzing users' sentiment towards popular consumer industries brands twitter. social media serves unified platform users express thoughts subjects ranging daily lives opinion consumer brands products. users wield enormous influence shaping opinions consumers influence brand perception, brand loyalty brand advocacy. paper, analyze opinion 19m twitter users towards 62 popular industries, encompassing 12,898 enterprise consumer brands, well associated subject matter topics, via sentiment analysis 330m tweets period spanning month. find users tend positive towards manufacturing negative towards service industries. addition, tend positive negative interacting brands generally twitter. also find sentiment towards brands within industry varies greatly demonstrate using two industries use cases. addition, discover strong correlation topic sentiments different industries, demonstrating topic sentiments highly dependent context industry mentioned in. demonstrate value analysis order assess impact brands social media. hope initial study prove valuable researchers companies understanding users' perception industries, brands associated topics encourage research field.",4 "operator entity extraction mapreduce. dictionary-based entity extraction involves finding mentions dictionary entities text. text mentions often noisy, containing spurious missing words. efficient algorithms detecting approximate entity mentions follow one two general techniques. first approach build index entities perform index lookups document substrings. second approach recognizes number substrings generated documents explode large numbers, get around this, use filter prune many substrings match dictionary entity verify remaining substrings entity mentions dictionary entities, means text join. 
choice index-based approach filter & verification-based approach case-to-case decision best approach depends characteristics input entity dictionary, example frequency entity mentions. choosing right approach setting make substantial difference execution time. making choice however non-trivial parameters within approaches make space possible approaches large. paper, present cost-based operator making choice among execution plans entity extraction. since need deal large dictionaries even larger datasets, operator developed implementations mapreduce distributed algorithms.",4 "fixed-point coordinate descent algorithms regularized kernel methods. paper, study two general classes optimization algorithms kernel methods convex loss function quadratic norm regularization, analyze convergence. first approach, based fixed-point iterations, simple implement analyze, easily parallelized. second, based coordinate descent, exploits structure additively separable loss functions compute solutions line searches closed form. instances general classes algorithms already incorporated state art machine learning software large scale problems. start solution characterization regularized problem, obtained using sub-differential calculus resolvents monotone operators, holds general convex loss functions regardless differentiability. two methodologies described paper regarded instances non-linear jacobi gauss-seidel algorithms, well-suited solve large scale problems.",4 "neutrosophic entropy five components. paper presents two variants penta-valued representation neutrosophic entropy. first extension kaufmann's formula second extension kosko's formula. based primary three-valued information represented degree truth, degree falsity degree neutrality built penta-valued representations better highlights specific features neutrosophic entropy. thus, highlight five features neutrosophic uncertainty ambiguity, ignorance, contradiction, neutrality saturation.
five features supplemented seven partition unity adding two features neutrosophic certainty truth falsity. paper also presents particular forms neutrosophic entropy obtained case bifuzzy representations, intuitionistic fuzzy representations, paraconsistent fuzzy representations finally case fuzzy representations.",4 "online control false discovery rate decaying memory. online multiple testing problem, p-values corresponding different null hypotheses observed one one, decision whether reject current hypothesis must made immediately, next p-value observed. alpha-investing algorithms control false discovery rate (fdr), formulated foster stine, generalized applied many settings, including quality-preserving databases science multiple a/b multi-armed bandit tests internet commerce. paper improves class generalized alpha-investing algorithms (gai) four ways: (a) show uniformly improve power entire class monotone gai procedures awarding alpha-wealth rejection, giving win-win resolution recent dilemma raised javanmard montanari, (b) demonstrate incorporate prior weights indicate domain knowledge hypotheses likely non-null, (c) allow differing penalties false discoveries indicate hypotheses may important others, (d) define new quantity called decaying memory false discovery rate (mem-fdr) may meaningful truly temporal applications, alleviates problems describe refer ""piggybacking"" ""alpha-death"". gai++ algorithms incorporate four generalizations simultaneously, reduce powerful variants earlier algorithms weights decay set unity. finally, also describe simple method derive new online fdr rules based estimated false discovery proportion.",19 """liar, liar pants fire"": new benchmark dataset fake news detection. automatic fake news detection challenging problem deception detection, tremendous real-world political social impacts. however, statistical approaches combating fake news dramatically limited lack labeled benchmark datasets. 
paper, present liar: new, publicly available dataset fake news detection. collected decade-long, 12.8k manually labeled short statements various contexts politifact.com, provides detailed analysis report links source documents case. dataset used fact-checking research well. notably, new dataset order magnitude larger previously largest public fake news datasets similar type. empirically, investigate automatic fake news detection based surface-level linguistic patterns. designed novel, hybrid convolutional neural network integrate meta-data text. show hybrid approach improve text-only deep learning model.",4 "dotmark - benchmark discrete optimal transport. wasserstein metric earth mover's distance (emd) useful tool statistics, machine learning computer science many applications biological medical imaging, among others. especially light increasingly complex data, computation distances via optimal transport often limiting factor. inspired challenge, variety new approaches optimal transport proposed recent years along new methods comes need meaningful comparison. paper, introduce benchmark discrete optimal transport, called dotmark, designed serve neutral collection problems, discrete optimal transport methods tested, compared one another, brought limits large-scale instances. consists variety grayscale images, various resolutions classes, several types randomly generated images, classical test images real data microscopy. along dotmark present survey performance test cross section established methods ranging traditional algorithms, transportation simplex, recently developed approaches, shielding neighborhood method, including also comparison commercial solvers.",12 "generalised reichenbachian common cause systems. principle common cause claims improbable coincidence occurred, must exist common cause. generally taken mean positive correlations non-causally related events disappear conditioning action underlying common cause. 
extended interpretation principle, contrast, urges common causes called order explain positive deviations estimated correlation two events expected value correlation. aim paper provide extended reading principle general probabilistic model, capturing simultaneous action system multiple common causes. end, two distinct models elaborated, necessary sufficient conditions existence determined.",19 "ontological architecture orbital debris data. orbital debris problem presents opportunity inter-agency international cooperation toward mutually beneficial goals debris prevention, mitigation, remediation, improved space situational awareness (ssa). achieving goals requires sharing orbital debris ssa data. toward this, present ontological architecture orbital debris domain, taking steps creation orbital debris ontology (odo). purpose ontological system (i) represent general orbital debris ssa domain knowledge, (ii) structure, standardize needed, orbital data terminology, (iii) foster semantic interoperability data-sharing. hope (iv) contribute solving orbital debris problem, improving peaceful global ssa, ensuring safe space travel future generations.",4 "distributed weighted parameter averaging svm training big data. two popular approaches distributed training svms big data parameter averaging admm. parameter averaging efficient suffers loss accuracy increase number partitions, admm feature space accurate suffers slow convergence. paper, report hybrid approach called weighted parameter averaging (wpa), optimizes regularized hinge loss respect weights parameters. problem shown solving svm projected space. also demonstrate $o(\frac{1}{n})$ stability bound final hypothesis given wpa, using novel proof techniques. experimental results variety toy real world datasets show approach significantly accurate parameter averaging high number partitions. 
it is also seen that the proposed method enjoys much faster convergence compared to admm in the feature space.",4 "algorithms, initializations, and convergence for the nonnegative matrix factorization. it is well known that good initializations can improve the speed and accuracy of the solutions of many nonnegative matrix factorization (nmf) algorithms. many nmf algorithms are sensitive with respect to the initialization of w or h or both. this is especially true of algorithms of the alternating least squares (als) type, including the two new als algorithms that we present in this paper. we compare the results of six initialization procedures (two standard and four new) on our als algorithms. lastly, we discuss the practical issue of choosing an appropriate convergence criterion.",4 "automatic detection of diabetes diagnosis using feature weighted support vector machines based on mutual information and modified cuckoo search. diabetes is a major health problem in both developing and developed countries and its incidence is rising dramatically. in this study, we investigate a novel automatic approach to diagnose diabetes disease based on feature weighted support vector machines (fw-svms) and modified cuckoo search (mcs). the proposed model consists of three stages: firstly, pca is applied to select the optimal subset of features out of the set of all features. secondly, mutual information is employed to construct the fw-svm by weighting different features based on their degree of importance. finally, since parameter selection plays a vital role in the classification accuracy of svms, mcs is applied to select the best parameter values. the proposed mi-mcs-fwsvm method obtains 93.58% accuracy on the uci dataset. the experimental results demonstrate that our method outperforms the previous methods by not only giving more accurate results but also significantly speeding up the classification procedure.",4 "learning with a wasserstein loss. learning to predict multi-label outputs is challenging, but in many problems there is a natural metric on the outputs that can be used to improve predictions. in this paper we develop a loss function for multi-label learning, based on the wasserstein distance. the wasserstein distance provides a natural notion of dissimilarity for probability measures. although optimizing with respect to the exact wasserstein distance is costly, recent work has described a regularized approximation that can be efficiently computed.
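The nmf record above discusses alternating least squares (als) algorithms and their sensitivity to initialization. As a hedged sketch of the generic als scheme (not the paper's specific algorithms or its six initialization procedures), each factor is solved by unconstrained least squares and then clipped to enforce nonnegativity; all names here are my own:

```python
import numpy as np

def als_nmf(V, r, iters=200, seed=0):
    """basic als-type nmf sketch: solve for H and W in turn by least
    squares, clipping negatives to zero after each solve."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, r))  # random initialization (one of many options)
    H = None
    for _ in range(iters):
        H = np.maximum(np.linalg.lstsq(W, V, rcond=None)[0], 0)
        W = np.maximum(np.linalg.lstsq(H.T, V.T, rcond=None)[0].T, 0)
    return W, H
```

A smarter initialization of `W` (e.g. SVD-based), as the abstract argues, typically reduces the number of iterations needed to reach a given reconstruction error.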
we describe an efficient learning algorithm based on this regularization, as well as a novel extension of the wasserstein distance from probability measures to unnormalized measures. we also describe a statistical learning bound for the loss. the wasserstein loss can encourage smoothness of the predictions with respect to a chosen metric on the output space. we demonstrate this property on a real-data tag prediction problem, using the yahoo flickr creative commons dataset, outperforming a baseline that doesn't use the metric.",4 "robust global localization using clustered particle filtering. global mobile robot localization is the problem of determining a robot's pose in an environment, using sensor data, when the starting position is unknown. a family of probabilistic algorithms known as monte carlo localization (mcl) is currently among the most popular methods for solving this problem. mcl algorithms represent a robot's belief by a set of weighted samples, which approximate the posterior probability of where the robot is located by using a bayesian formulation of the localization problem. this article presents an extension to the mcl algorithm, which addresses its problems when localizing in highly symmetrical environments; a situation where mcl is often unable to correctly track equally probable poses for the robot. the problem arises from the fact that sample sets in mcl often become impoverished, when samples are generated according to their posterior likelihood. our approach incorporates the idea of clusters of samples and modifies the proposal distribution considering the probability mass of those clusters. experimental results are presented that show that this new extension to the mcl algorithm successfully localizes in symmetric environments where ordinary mcl often fails.",4 "machine learning for bioclimatic modelling. many machine learning (ml) approaches are widely used to generate bioclimatic models for the prediction of the geographic range of an organism as a function of climate. applications such as prediction of range shift in an organism and range of invasive species influenced by climate change are important parameters in understanding the impact of climate change. however, the success of machine learning-based approaches depends on a number of factors.
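The wasserstein-loss record above relies on an entropy-regularized approximation of optimal transport that can be computed efficiently. A common way to compute it is via sinkhorn iterations, which alternately rescale a Gibbs kernel to match the two marginals; this is a generic illustrative sketch of that regularized transport computation, not the paper's learning algorithm, and all names are my own:

```python
import numpy as np

def sinkhorn(a, b, C, reg=0.05, iters=500):
    """entropy-regularized optimal transport via sinkhorn iterations:
    returns a transport plan whose marginals approximate a and b."""
    K = np.exp(-C / reg)          # gibbs kernel from the cost matrix
    u = np.ones_like(a)
    for _ in range(iters):
        v = b / (K.T @ u)         # rescale to match column marginals
        u = a / (K @ v)           # rescale to match row marginals
    return u[:, None] * K * v[None, :]
```

The plan's entry-wise product with the cost matrix then gives the regularized transport cost, which is the quantity used as a differentiable surrogate for the exact wasserstein distance.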
while it can be safely said that no particular ml technique is effective in all applications and the success of a technique is predominantly dependent on the application or the type of the problem, it is useful to understand their behavior to ensure an informed choice of techniques. this paper presents a comprehensive review of machine learning-based bioclimatic model generation and analyses the factors influencing the success of such models. considering the wide use of statistical techniques, our discussion also includes conventional statistical techniques used in bioclimatic modelling.",4 "constructing a non-negative low rank and sparse graph with data-adaptive features. this paper aims at constructing a good graph for discovering intrinsic data structures in a semi-supervised learning setting. firstly, we propose to build a non-negative low-rank and sparse (referred to as nnlrs) graph for the given data representation. specifically, the weights of edges in the graph are obtained by seeking a nonnegative low-rank and sparse matrix that represents each data sample as a linear combination of others. the so-obtained nnlrs-graph can capture both the global mixture of subspaces structure (by the low rankness) and the locally linear structure (by the sparseness) of the data, hence it is both generative and discriminative. secondly, as good features are extremely important for constructing a good graph, we propose to learn the data embedding matrix and construct the graph jointly within one framework, which is termed as nnlrs with embedded features (referred to as nnlrs-ef). extensive experiments on three publicly available datasets demonstrate that the proposed method outperforms the state-of-the-art graph construction method by a large margin in semi-supervised classification and discriminative analysis, which verifies the effectiveness of the proposed method.",4 "binary matrix completion using unobserved entries. a matrix completion problem, which aims to recover a complete matrix from its partial observations, is one of the important problems in the machine learning field and has been studied actively. however, there is a discrepancy between the mainstream problem setting, which assumes continuous-valued observations, and some practical applications such as recommendation systems and sns link predictions where observations take discrete or even binary values. to cope with this problem, davenport et al.
(2014) proposed the binary matrix completion (bmc) problem, where observations are quantized into binary values. hsieh et al. (2015) proposed the pu (positive and unlabeled) matrix completion problem, an extension of the bmc problem. this problem targets settings where we cannot observe negative values, such as sns link predictions. in the construction of their method for this setting, they introduced a methodology of the classification problem, regarding each matrix entry as a sample. their risk, which defines losses over unobserved entries as well, indicates the possibility of the use of unobserved entries. in this paper, motivated by a semi-supervised classification method recently proposed by sakai et al. (2017), we develop a method for the bmc problem which can use all of the positive, negative, and unobserved entries, by combining the risks of davenport et al. (2014) and hsieh et al. (2015). to the best of our knowledge, this is the first bmc method which exploits all kinds of matrix entries. we experimentally show that an appropriate mixture of risks improves the performance.",19 "hierarchical spatial transformer network. computer vision researchers have been expecting neural networks to have a spatial transformation ability to eliminate the interference caused by geometric distortion for a long time. the emergence of the spatial transformer network makes this dream come true. the spatial transformer network and its variants can handle global displacement well, but lack the ability to deal with local spatial variance. hence how to achieve a better manner of deformation in a neural network has become a pressing matter of the moment. to address this issue, we analyze the advantages and disadvantages of approximation theory and optical flow theory, and combine them to propose a novel way to achieve image deformation, implemented with a hierarchical convolutional neural network. this new approach solves for a linear deformation along with an optical flow field to model image deformation. in experiments on cluttered mnist handwritten digits classification and image plane alignment, our method outperforms baseline methods by a large margin.",4 "towards a continuous knowledge learning engine for chatbots. although chatbots have been popular in recent years, they still have some serious weaknesses which limit the scope of their applications. one major weakness is that they cannot learn new knowledge during the conversation process, i.e., their knowledge is fixed beforehand and cannot be expanded or updated during conversation.
in this paper, we propose to build a general knowledge learning engine for chatbots to enable them to continuously and interactively learn new knowledge during conversations. as time goes by, they become more and more knowledgeable and better and better at learning and conversation. we model the task as an open-world knowledge base completion problem and propose a novel technique called lifelong interactive learning and inference (lili) to solve it. lili works by imitating how humans acquire knowledge and perform inference during an interactive conversation. our experimental results show that lili is highly promising.",4 "solving the ""false positives"" problem in fraud prediction. in this paper, we present an automated feature engineering based approach to dramatically reduce false positives in fraud prediction. false positives plague the fraud prediction industry. it is estimated that only 1 in 5 transactions declared as fraud are actually fraud and roughly 1 in every 6 customers have had a valid transaction declined in the past year. to address this problem, we use the deep feature synthesis algorithm to automatically derive behavioral features based on the historical data of the card associated with a transaction. we generate 237 features (>100 behavioral patterns) for each transaction, and use a random forest to learn a classifier. we tested our machine learning model on data from a large multinational bank and compared it to their existing solution. on unseen data of 1.852 million transactions, we were able to reduce the false positives by 54% and provide savings of 190k euros. we also assess how to deploy this solution, and whether it necessitates streaming computation for real time scoring. we found that our solution can maintain similar benefits even when historical features are computed only once every 7 days.",4 "top-k query answering in datalog+/- ontologies under subjective reports (technical report). the use of preferences in query answering, both in traditional databases and in ontology-based data access, has recently received much attention, due to many real-world applications. in this paper, we tackle the problem of top-k query answering in datalog+/- ontologies subject to the querying user's preferences and a collection of (subjective) reports of other users. here, each report consists of scores for a list of features, the author's preferences among the features, as well as other information.
these pieces of information of every report are then combined, along with the querying user's preferences and his/her trust in each report, to rank the query results. we present two alternative such rankings, along with algorithms for top-k (atomic) query answering under these rankings. we also show that, under suitable assumptions, these algorithms run in polynomial time in the data complexity. we finally present more general reports, which are associated with sets of atoms rather than single atoms.",4 "an automatic method of finding topic boundaries. this article outlines a new method of locating discourse boundaries based on lexical cohesion and a graphical technique called dotplotting. the application of dotplotting to discourse segmentation can be performed either manually, by examining a graph, or automatically, using an optimization algorithm. the results of two experiments involving automatically locating boundaries between a series of concatenated documents are presented. areas of application and future directions for this work are also outlined.",2 "a survey of stealth malware: attacks, mitigation measures, and steps toward autonomous open world solutions. as our professional, social, and financial existences become increasingly digitized and as our government, healthcare, and military infrastructures rely more on computer technologies, they present larger and more lucrative targets for malware. stealth malware in particular poses an increased threat because it is specifically designed to evade detection mechanisms, spreading dormant, in the wild for extended periods of time, gathering sensitive information or positioning itself for a high-impact zero-day attack. policing the growing attack surface requires the development of efficient anti-malware solutions with improved generalization to detect novel types of malware and resolve these occurrences with as little burden on human experts as possible. in this paper, we survey malicious stealth technologies as well as existing solutions for detecting and categorizing these countermeasures autonomously. while machine learning offers promising potential for increasingly autonomous solutions with improved generalization to new malware types, both at the network level and at the host level, our findings suggest that several flawed assumptions inherent to most recognition algorithms prevent a direct mapping between the stealth malware recognition problem and a machine learning solution.
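The topic-boundary record above segments discourse by lexical cohesion. As a toy illustration of that idea (not the dotplotting graph or the paper's optimization algorithm), one can score each candidate boundary by the word overlap between the text before and after it, and place the boundary where cohesion is weakest; all names here are my own:

```python
def cohesion_boundary(sentences):
    """toy lexical-cohesion segmenter: return the boundary index where the
    jaccard word overlap between the two sides is lowest."""
    def words(chunk):
        return {w.lower().strip(".,") for s in chunk for w in s.split()}
    best_gap, best_score = None, None
    for gap in range(1, len(sentences)):
        left, right = words(sentences[:gap]), words(sentences[gap:])
        overlap = len(left & right) / len(left | right)  # jaccard cohesion
        if best_score is None or overlap < best_score:
            best_gap, best_score = gap, overlap
    return best_gap
```

On two concatenated mini-documents with disjoint vocabularies, the weakest-cohesion point falls at the document seam, which is exactly the experimental setup the abstract describes.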
the most notable of these flawed assumptions is the closed world assumption: that no sample belonging to a class outside of a static training set will appear at query time. we present a formalized adaptive open world framework for stealth malware recognition and relate it mathematically to research from other machine learning domains.",4 "restricted manipulation in iterative voting: convergence and condorcet efficiency. in collective decision making, where a voting rule is used to take a collective decision among a group of agents, manipulation by one of the agents is usually considered negative behavior to be avoided, or at least to be made computationally difficult for the agents to perform. however, there are scenarios in which a restricted form of manipulation can instead be beneficial. in this paper we consider the iterative version of several voting rules, where at each step one agent is allowed to manipulate by modifying his ballot according to a set of restricted manipulation moves which are computationally easy and require little information to be performed. we prove convergence of iterative voting rules when restricted manipulation is allowed, and we present experiments showing that most iterative voting rules have a higher condorcet efficiency than their non-iterative version.",4 "improved sparse low-rank matrix estimation. we address the problem of estimating a sparse low-rank matrix from its noisy observation. we propose an objective function consisting of a data-fidelity term and two parameterized non-convex penalty functions. further, we show how to set the parameters of the non-convex penalty functions, in order to ensure that the objective function is strictly convex. the proposed objective function better estimates sparse low-rank matrices than a convex method which utilizes the sum of the nuclear norm and the $\ell_1$ norm. we derive an algorithm (as an instance of admm) to solve the proposed problem, and guarantee its convergence provided the scalar augmented lagrangian parameter is set appropriately. we demonstrate the proposed method for denoising an audio signal and an adjacency matrix representing protein interactions in the `escherichia coli' bacteria.",12 "thickness mapping of eleven retinal layers in normal eyes using spectral domain optical coherence tomography. purpose.
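The iterative-voting record above lets agents take turns revising their ballots. As a hedged toy simulation of that dynamic for plurality only (the paper covers several rules and restricted move sets; this sketch uses unrestricted best responses and a lexicographic tie-break, and all names are my own):

```python
def iterative_plurality(profile):
    """iterative plurality voting sketch: agents take turns switching their
    ballot to a candidate they prefer over the current winner whenever the
    switch improves the outcome for them; stops when no one wants to move."""
    votes = [pref[0] for pref in profile]  # sincere initial ballots
    def winner(vs):
        tally = {}
        for v in vs:
            tally[v] = tally.get(v, 0) + 1
        return max(sorted(tally), key=lambda c: tally[c])  # ties: alphabetic
    changed = True
    while changed:
        changed = False
        for i, pref in enumerate(profile):
            current = winner(votes)
            for cand in pref:  # candidates strictly preferred to the winner
                if cand == current:
                    break
                trial = votes[:]
                trial[i] = cand
                if pref.index(winner(trial)) < pref.index(current):
                    votes[i] = cand  # beneficial manipulation
                    changed = True
                    break
    return winner(votes)
```

On a three-voter cycle profile the dynamic settles on the majority-supported candidate rather than the tie-break winner of the sincere profile, illustrating the condorcet-efficiency gain the abstract reports.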
this study was conducted to determine the thickness map of eleven retinal layers in normal subjects by spectral domain optical coherence tomography (sd-oct) and to evaluate their association with sex and age. methods. the mean regional retinal thickness of 11 retinal layers was obtained by an automatic three-dimensional diffusion-map-based method in 112 normal eyes of 76 iranian subjects. results. the thickness map of the central foveal area in layers 1, 3, and 4 displayed the minimum thickness (p<0.005 for all). maximum thickness was observed nasal to the fovea in layer 1 (p<0.001) and in a circular pattern in the parafoveal retinal area in layers 2, 3 and 4, and in the central foveal area in layer 6 (p<0.001). in the temporal and inferior quadrants, the total retinal thickness and the thickness of some quadrants of layer 1 were significantly greater in men than in women. in the surrounding eight sectors, the total retinal thickness and the thickness of a limited number of sectors in layers 1 and 4 were significantly correlated with age. conclusion. sd-oct demonstrated the three-dimensional thickness distribution of retinal layers in normal eyes. the thickness of the layers varied with sex and age in different sectors. these variables should be considered when evaluating macular thickness.",4 "learning an executable neural semantic parser. this paper describes a neural semantic parser that maps natural language utterances onto logical forms which can be executed against a task-specific environment, such as a knowledge base or a database, to produce a response. the parser generates tree-structured logical forms with a transition-based approach which combines a generic tree-generation algorithm with domain-general operations defined by the logical language. the generation process is modeled by structured recurrent neural networks, which provide a rich encoding of the sentential context and generation history for making predictions. to tackle mismatches between natural language and logical form tokens, various attention mechanisms are explored. finally, we consider different training settings for the neural semantic parser, including fully supervised training where annotated logical forms are given, weakly-supervised training where denotations are provided, and distant supervision where only unlabeled sentences and a knowledge base are available. experiments across a wide range of datasets demonstrate the effectiveness of our parser.",4 "towards accurate multi-person pose estimation in the wild.
we propose a method for multi-person detection and 2-d pose estimation that achieves state-of-art results on the challenging coco keypoints task. it is a simple, yet powerful, top-down approach consisting of two stages. in the first stage, we predict the location and scale of boxes which are likely to contain people; for this we use the faster rcnn detector. in the second stage, we estimate the keypoints of the person potentially contained in each proposed bounding box. for each keypoint type we predict dense heatmaps and offsets using a fully convolutional resnet. to combine these outputs we introduce a novel aggregation procedure to obtain highly localized keypoint predictions. we also use a novel form of keypoint-based non-maximum-suppression (nms), instead of the cruder box-level nms, and a novel form of keypoint-based confidence score estimation, instead of box-level scoring. trained on coco data alone, our final system achieves average precision of 0.649 on the coco test-dev set and 0.643 on the test-standard set, outperforming the winner of the 2016 coco keypoints challenge and other recent state-of-art. further, by using additional in-house labeled data we obtain an even higher average precision of 0.685 on the test-dev set and 0.673 on the test-standard set, a more than 5% absolute improvement compared to the previous best performing method on the same dataset.",4 "infinity and computable probability. we show, contrary to the classical supposition, that a process generating symbols according to a probability distribution need not, in all likelihood, produce a given finite text in finite time, even if it is guaranteed to produce the text in infinite time. this result extends to target-free text generation and has implications for simulations of probabilistic processes.",12 "the expressive power of word embeddings. we seek to better understand the difference in quality of several publicly released embeddings. we propose several tasks that help to distinguish the characteristics of different embeddings. our evaluation of sentiment polarity and synonym/antonym relations shows that embeddings are able to capture surprisingly nuanced semantics even in the absence of sentence structure. moreover, benchmarking the embeddings shows great variance in the quality and characteristics of the semantics captured by the tested embeddings.
finally, we show the impact of varying the number of dimensions and the resolution of each dimension on the effective useful features captured by the embedding space. our contributions highlight the importance of embeddings for nlp tasks and the effect of their quality on the final results.",4 "multi-level coding efficiency with improved quality for image compression based on ambtc. in this paper, an extended version of absolute moment block truncation coding (ambtc) is proposed to compress images. generally the elements of a bitplane used in the variants of block truncation coding (btc) are of size 1 bit. but they have been extended to two bits in the proposed method. the number of statistical moments preserved to reconstruct the compressed image has also been raised from 2 to 4. hence, the quality of the reconstructed images has been improved significantly from 33.62 to 38.12 with an increase in bpp by 1. the increased bpp (3) is reduced to 1.75 in multiple levels: in level one, by dropping 4 elements of the bitplane in such a way that the pixel values of the dropped elements can easily be interpolated without much loss in quality; in level two, eight elements are dropped and reconstructed later; and in level three, the size of the statistical moments is reduced. the experiments were carried out over standard images of varying intensities. in all the cases, the proposed method outperforms the existing ambtc technique in terms of psnr and bpp.",4 "integrating atlas and graph cut methods for lv segmentation from cardiac cine mri. magnetic resonance imaging (mri) has evolved as a clinical standard-of-care imaging modality for cardiac morphology, function assessment, and guidance of cardiac interventions. all these applications rely on accurate extraction of the myocardial tissue and blood pool from the imaging data. we propose a framework for left ventricle (lv) segmentation from cardiac cine-mri. first, we segment the lv blood pool using iterative graph cuts, and subsequently use this information to segment the myocardium. we formulate the segmentation procedure as an energy minimization problem in a graph subject to a shape prior obtained by label propagation from an average atlas using affine registration. the proposed framework has been validated on 30 patient cardiac cine-mri datasets available through the stacom lv segmentation challenge and yielded fast, robust, and accurate segmentation results.",4 "on hölder projective divergences.
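The ambtc record above extends the classic one-bit scheme to two-bit bitplanes and four moments. For context, here is a sketch of the standard one-bit ambtc baseline it builds on (not the paper's multi-level extension): each block keeps a bitplane thresholded at the block mean plus the means of the two resulting pixel groups; all names are my own:

```python
def ambtc_block(block):
    """classic one-bit ambtc on a flat pixel block: threshold at the block
    mean and keep the low/high group means plus the bitplane."""
    m = sum(block) / len(block)
    hi = [p for p in block if p >= m]
    lo = [p for p in block if p < m]
    hi_mean = sum(hi) / len(hi) if hi else m
    lo_mean = sum(lo) / len(lo) if lo else m
    bits = [1 if p >= m else 0 for p in block]
    return bits, lo_mean, hi_mean

def ambtc_reconstruct(bits, lo_mean, hi_mean):
    """decoder: each pixel becomes the mean of its group."""
    return [hi_mean if b else lo_mean for b in bits]
```

The proposed method's two-bit bitplane and four moments refine exactly this quantization, trading higher bpp for psnr and then recovering bpp by dropping interpolable bitplane elements.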
we describe a framework to build distances by measuring the tightness of inequalities, and introduce the notion of proper statistical divergences and improper pseudo-divergences. we then consider the h\""older ordinary and reverse inequalities, and present two novel classes of h\""older divergences and pseudo-divergences that both encapsulate the special case of the cauchy-schwarz divergence. we report closed-form formulas for those statistical dissimilarities when considering distributions belonging to the same exponential family, provided the natural parameter space is a cone (e.g., multivariate gaussians) or affine (e.g., categorical distributions). those new classes of h\""older distances are invariant to rescaling, and thus do not require distributions to be normalized. finally, we show how to compute statistical h\""older centroids with respect to those divergences, and carry out center-based clustering toy experiments on a set of gaussian distributions which demonstrate empirically that symmetrized h\""older divergences outperform the symmetric cauchy-schwarz divergence.",4 "accumulated gradient normalization. this work addresses the instability in asynchronous data parallel optimization. it does so by introducing a novel distributed optimizer which is able to efficiently optimize a centralized model under communication constraints. the optimizer achieves this by pushing a normalized sequence of first-order gradients to a parameter server. this implies that the magnitude of a worker delta is smaller compared to an accumulated gradient, and that it provides a better direction towards a minimum compared to first-order gradients, which in turn also forces possible implicit momentum fluctuations to be aligned, since we make the assumption that all workers contribute towards a single minimum. as a result, our approach mitigates the parameter staleness problem effectively, since staleness in asynchrony induces (implicit) momentum, and achieves a better convergence rate compared to optimizers such as asynchronous easgd and dynsgd, as we show empirically.",19 "implementing a bayesian scheme for revising belief commitments. previous work on classifying complex ship images [1,2] has evolved into an effort to develop software tools for building and solving generic classification problems. managing the uncertainty associated with feature data and other evidence is an important issue in this endeavor.
bayesian techniques for managing uncertainty [7,12,13] have proven useful in managing several of the belief maintenance requirements of classification problem solving. one such requirement is the need to give qualitative explanations of what is believed. pearl [11] addresses this need by computing what he calls the belief commitment: the most probable instantiation of the hypothesis variables given the evidence available. to compute belief commitments, a straightforward implementation of pearl's procedure involves finding an analytical solution to often difficult optimization problems. we describe an efficient implementation of this procedure using tensor products that solves these problems enumeratively and avoids the need for case by case analysis. the procedure is thereby made practical for use in the general case.",4 "spike timing precision and neural error correction: local behavior. the effects of spike timing precision and dynamical behavior on error correction in spiking neurons were investigated. stationary discharges -- phase locked, quasiperiodic, or chaotic -- were induced in a simulated neuron by presenting pacemaker presynaptic spike trains across a model of a prototypical inhibitory synapse. reduced timing precision was modeled by jittering presynaptic spike times. the aftereffects of errors -- in this communication, missed presynaptic spikes -- were determined by comparing postsynaptic spike times between simulations identical except for the presence or absence of errors. the results show that the effects of an error vary greatly depending on the ongoing dynamical behavior. in the case of phase lockings, a high degree of presynaptic spike timing precision can provide significantly faster error recovery. for non-locked behaviors, isolated missed spikes can have little discernible aftereffect (or can even serve paradoxically to reduce uncertainty in postsynaptic spike timing), regardless of presynaptic imprecision. this suggests two possible categories of error correction: high-precision locking with rapid recovery, and low-precision non-locked with error immunity.",16 "voi-aware mcts. uct, a state-of-the art algorithm for monte carlo tree search (mcts) in games and markov decision processes, is based on ucb1, a sampling policy for the multi-armed bandit problem (mab) that minimizes the cumulative regret.
however, search differs from mab in that in mcts it is usually only the final ""arm pull"" (the actual move selection) that collects a reward, rather than all ""arm pulls"". in this paper, an mcts sampling policy based on value of information (voi) estimates of rollouts is suggested. empirical evaluation of the policy and comparison to ucb1 and uct is performed on random mab instances as well as on computer go.",4 "estimating the success of unsupervised image to image translation. while in supervised learning the validation error is an unbiased estimator of the generalization (test) error and complexity-based generalization bounds are abundant, no such bounds exist for learning a mapping in an unsupervised way. as a result, when training gans, and specifically when using gans for learning to map between domains in a completely unsupervised way, one is forced to select hyperparameters and the stopping epoch by subjectively examining multiple options. we propose a novel bound for predicting the success of unsupervised cross domain mapping methods, which is motivated by the recently proposed simplicity principle. the bound can be applied both in expectation, for comparing hyperparameters and for selecting a stopping criterion, and per sample, in order to predict the success of a specific cross-domain translation. the utility of the bound is demonstrated in an extensive set of experiments employing multiple recent algorithms. our code is available at https://github.com/sagiebenaim/gan_bound .",4 "a simple language model based on pmi matrix approximations. in this study, we introduce a new approach for learning language models by training them to estimate word-context pointwise mutual information (pmi), and then deriving the desired conditional probabilities from pmi at test time. specifically, we show that with minor modifications to word2vec's algorithm, we get principled language models that are closely related to the well-established noise contrastive estimation (nce) based language models. a compelling aspect of our approach is that our models are trained with the same simple negative sampling objective function that is commonly used in word2vec to learn word embeddings.",4 "an optimal bayesian network based solution scheme for the constrained stochastic on-line equi-partitioning problem. a number of intriguing decision scenarios revolve around partitioning a collection of objects to optimize some application specific objective function.
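The voi-aware mcts record above contrasts its policy with ucb1, the sampling rule underlying uct. For reference, here is a standard ucb1 sketch on a bernoulli bandit (the baseline the paper compares against, not its voi-based policy); all names are my own:

```python
import math, random

def ucb1(means, horizon=20000, seed=0):
    """ucb1 on a bernoulli multi-armed bandit: play each arm once, then pull
    the arm maximizing empirical mean + sqrt(2 ln t / n_i)."""
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k
    sums = [0.0] * k
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # initialization: one pull per arm
        else:
            arm = max(range(k), key=lambda i: sums[i] / counts[i]
                      + math.sqrt(2 * math.log(t) / counts[i]))
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts
```

Ucb1 minimizes cumulative regret, so the best arm accumulates almost all pulls; the paper's point is that mcts only "cashes in" the final move, which motivates a voi-based policy instead.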
this problem is generally referred to as the object partitioning problem (opp) and is known to be np-hard. we here consider a particularly challenging version of the opp, namely, the stochastic on-line equi-partitioning problem (so-epp). in so-epp, the target partitioning is unknown and must be inferred purely by observing an on-line sequence of object pairs. paired objects belong to the same partition with probability $p$ and to different partitions with probability $1-p$, with $p$ also being unknown. as an additional complication, the partitions are required to be of equal cardinality. previously, only sub-optimal solution strategies have been proposed for so-epp. in this paper, we propose the first optimal solution strategy. in brief, the scheme that we propose, bn-epp, is founded on a bayesian network representation of so-epp problems. based on probabilistic reasoning, we are not only able to infer the underlying object partitioning with optimal accuracy; we are also able to simultaneously infer $p$, allowing us to accelerate learning as object pairs arrive. furthermore, our scheme is the first to support arbitrary constraints on the partitioning (constrained so-epp). being optimal, bn-epp provides superior performance compared to existing solution schemes. we additionally introduce walk-bn-epp, a novel walksat inspired algorithm for solving large scale bn-epp problems. finally, we provide a bn-epp based solution to the problem of order picking, a representative real-life application of bn-epp.",4 "predictive entropy search for bayesian optimization with unknown constraints. unknown constraints arise in many types of expensive black-box optimization problems. several methods have been proposed recently for performing bayesian optimization with constraints, based on the expected improvement (ei) heuristic. however, ei can lead to pathologies when used with constraints. for example, in the case of decoupled constraints---i.e., when one can independently evaluate the objective or the constraints---ei can encounter a pathology that prevents exploration. additionally, computing ei requires a current best solution, which may not exist if none of the data collected so far satisfy the constraints. by contrast, information-based approaches do not suffer from these failure modes. in this paper, we present a new information-based method called predictive entropy search with constraints (pesc).
we analyze the performance of pesc and show that it compares favorably to ei-based approaches on synthetic and benchmark problems, as well as on several real-world examples. we demonstrate that pesc is an effective algorithm that provides a promising direction towards a unified solution for constrained bayesian optimization.",19 "sparse representation of multivariate extremes with applications to anomaly ranking. extremes play a special role in anomaly detection. beyond inference and simulation purposes, probabilistic tools borrowed from extreme value theory (evt), such as the angular measure, can also be used to design novel statistical learning methods for anomaly detection/ranking. this paper proposes a new algorithm based on multivariate evt to learn how to rank observations in a high dimensional space with respect to their degree of 'abnormality'. the procedure relies on an original dimension-reduction technique in the extreme domain that possibly produces a sparse representation of multivariate extremes and allows to gain insight into the dependence structure thereof, escaping the curse of dimensionality. the representation output by the unsupervised methodology we propose can be combined with any anomaly detection technique tailored to non-extreme data. as it performs linearly with the dimension and almost linearly in the data (in o(dn log n)), it fits large scale problems. the approach in this paper is novel in that evt has never been used in its multivariate version in the field of anomaly detection. illustrative experimental results provide strong empirical evidence of the relevance of our approach.",19 "issues in the communication game. for interaction between autonomous agents, communication is analyzed in game-theoretic terms. a meaning game is proposed to formalize the core of intended communication, in which the sender sends a message and the receiver attempts to infer the meaning intended by the sender. basic issues involved in the game of natural language communication are discussed, such as salience, grammaticality, common sense, and common belief, together with a demonstration of the feasibility of a game-theoretic account of language.",4 "off-policy learning with eligibility traces: a survey. in the framework of markov decision processes, we consider off-policy learning, that is the problem of learning a linear approximation of the value function of some fixed policy from one trajectory possibly generated by some other policy.
we briefly review the on-policy learning algorithms of the literature (gradient-based and least-squares-based), adopting a unified algorithmic view. then, we highlight a systematic approach for adapting them to off-policy learning with eligibility traces. this leads to some known algorithms - off-policy lstd(\lambda), lspe(\lambda), td(\lambda), tdc/gq(\lambda) - and suggests new extensions - off-policy fpkf(\lambda), brm(\lambda), gbrm(\lambda), gtd2(\lambda). we describe a comprehensive algorithmic derivation of all algorithms in a recursive and memory-efficient form, discuss their known convergence properties and illustrate their relative empirical behavior on garnet problems. our experiments suggest that the standard algorithms off-policy lstd(\lambda)/lspe(\lambda) - and td(\lambda) if the feature space dimension is too large for a least-squares approach - perform the best.",4 "discriminative clustering with relative constraints. we study the problem of clustering with relative constraints, where each constraint specifies relative similarities among instances. in particular, each constraint $(x_i, x_j, x_k)$ is acquired by posing a query: is instance $x_i$ more similar to $x_j$ than to $x_k$? we consider the scenario where answers to such queries are based on an underlying (but unknown) class concept, which we aim to discover via clustering. different from existing methods that only consider constraints derived from yes answers, we also incorporate don't know responses. we introduce a discriminative clustering method with relative constraints (dcrc) which assumes a natural probabilistic relationship between the instances, their underlying cluster memberships, and the observed constraints. the objective is to maximize the model likelihood given the constraints, and in the meantime enforce cluster separation and cluster balance by also making use of the unlabeled instances. we evaluated the proposed method using constraints generated from ground-truth class labels, and from (noisy) human judgments in a user study.
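The eligibility-traces survey above adapts on-policy algorithms to the off-policy case. As background, here is a tabular sketch of the on-policy td(lambda) building block with accumulating traces (the surveyed off-policy variants add importance-sampling corrections and linear features on top of this); all names are my own:

```python
def td_lambda(episodes, n_states, alpha=0.1, gamma=1.0, lam=0.8):
    """tabular td(lambda) with accumulating eligibility traces.
    each episode is a list of (state, reward, next_state_or_None) steps."""
    V = [0.0] * n_states
    for episode in episodes:
        e = [0.0] * n_states  # eligibility traces, reset per episode
        for s, r, s_next in episode:
            target = r + (gamma * V[s_next] if s_next is not None else 0.0)
            delta = target - V[s]
            e[s] += 1.0  # accumulating trace for the visited state
            for i in range(n_states):
                V[i] += alpha * delta * e[i]  # credit all eligible states
                e[i] *= gamma * lam           # decay every trace
    return V
```

On a two-state chain that always ends with reward 1, both value estimates converge towards 1, with the trace propagating the terminal reward back to the start state faster than td(0) would.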
our experimental results demonstrate: 1) the usefulness of relative constraints, in particular when don't know answers are considered; 2) the improved performance of the proposed method over state-of-the-art methods that utilize either relative or pairwise constraints; and 3) the robustness of our method in the presence of noisy constraints, such as those provided by human judgement.",4 "gated recurrent networks for seizure detection. recurrent neural networks (rnns) with sophisticated units that implement a gating mechanism have emerged as a powerful technique for modeling sequential signals such as speech or electroencephalography (eeg). the latter is the focus of this paper. a significant big data resource, known as the tuh eeg corpus (tueeg), has recently become available for eeg research, creating a unique opportunity to evaluate these recurrent units on the task of seizure detection. in this study, we compare two types of recurrent units: long short-term memory units (lstm) and gated recurrent units (gru). these are evaluated using a state of the art hybrid architecture that integrates convolutional neural networks (cnns) with rnns. we also investigate a variety of initialization methods and show that initialization is crucial, since poorly initialized networks cannot be trained. furthermore, we explore regularization of these convolutional gated recurrent networks to address the problem of overfitting. our experiments revealed that convolutional lstm networks can achieve significantly better performance than convolutional gru networks. the convolutional lstm architecture with proper initialization and regularization delivers 30% sensitivity at 6 false alarms per 24 hours.",6 "machine learning methods for histopathological image analysis. the abundant accumulation of digital histopathological images has led to an increased demand for their analysis, such as computer-aided diagnosis using machine learning techniques. however, digital pathological images and related tasks have some issues to be considered. in this mini-review, we introduce the application of digital pathological image analysis using machine learning algorithms, address some problems specific to such analysis, and propose possible solutions.",4 "from propositional logic to plausible reasoning: a uniqueness theorem.
consider question extending propositional logic logic plausible reasoning, posit four requirements extension satisfy. requirement property classical propositional logic preserved extended logic; such, requirements simpler less problematic used cox's theorem variants. cox's theorem, requirements imply extended logic must isomorphic (finite-set) probability theory. also obtain specific numerical values probabilities, recovering classical definition probability theorem, truth assignments satisfy premise playing role ""possible cases.""",4 "phased exploration greedy exploitation stochastic combinatorial partial monitoring games. partial monitoring games repeated games learner receives feedback might different adversary's move even reward gained learner. recently, general model combinatorial partial monitoring (cpm) games proposed \cite{lincombinatorial2014}, learner's action space exponentially large adversary samples moves bounded, continuous space, according fixed distribution. paper gave confidence bound based algorithm (gcb) achieves $o(t^{2/3}\log t)$ distribution independent $o(\log t)$ distribution dependent regret bounds. implementation algorithm depends two separate offline oracles distribution dependent regret additionally requires existence unique optimal action learner. adopting cpm model, first contribution phased exploration greedy exploitation (pege) algorithmic framework problem. different algorithms within framework achieve $o(t^{2/3}\sqrt{\log t})$ distribution independent $o(\log^2 t)$ distribution dependent regret respectively. crucially, framework needs simpler ""argmax"" oracle gcb distribution dependent regret require existence unique optimal action. second contribution another algorithm, pege2, combines gap estimation pege algorithm, achieve $o(\log t)$ regret bound, matching gcb guarantee removing dependence size learner's action space. however, like gcb, pege2 requires access offline oracles existence unique optimal action. 
finally, discuss algorithm efficiently applied cpm problem practical interest: namely, online ranking feedback top.",4 "prior-based hierarchical segmentation highlighting structures interest. image segmentation process partitioning image set meaningful regions according criteria. hierarchical segmentation emerged major trend regard favors emergence important regions different scales. hand, many methods allow us prior information position structures interest images. paper, present versatile hierarchical segmentation method takes account prior spatial information outputs hierarchical segmentation emphasizes contours regions interest preserving important structures image. several applications presented illustrate method versatility efficiency.",4 "enhancing use case points estimation method using soft computing techniques. software estimation crucial task software engineering. software estimation encompasses cost, effort, schedule, size. importance software estimation becomes critical early stages software life cycle details software revealed yet. several commercial non-commercial tools exist estimate software early stages. software effort estimation methods require software size one important metric inputs consequently, software size estimation early stages becomes essential. one approaches used two decades early size effort estimation called use case points. use case points method relies use case diagram estimate size effort software projects. although use case points method widely used, limitations might adversely affect accuracy estimation. paper presents techniques using fuzzy logic neural networks improve accuracy use case points method. results showed improvement 22% obtained using proposed approach.",4 "graph learning filtered signals: graph system diffusion kernel identification. paper introduces novel graph signal processing framework building graph-based models classes filtered signals. 
framework, graph-based modeling formulated graph system identification problem, goal learn weighted graph (a graph laplacian matrix) graph-based filter (a function graph laplacian matrices). order solve proposed problem, algorithm developed jointly identify graph graph-based filter (gbf) multiple signal/data observations. algorithm valid assumption gbfs one-to-one functions. proposed approach applied learn diffusion (heat) kernels, popular various fields modeling diffusion processes. addition, specific choices graph-based filters, proposed problem reduces graph laplacian estimation problem. experimental results demonstrate proposed algorithm outperforms current state-of-the-art methods. also implement framework real climate dataset modeling temperature signals.",4 "sample must: optimal functional sampling. examine fundamental problem models various active sampling setups, network tomography. analyze sampling multivariate normal distribution unknown expectation needs estimated: setup possible sample distribution given set linear functionals, difficulty addressed optimally select combinations achieve low estimation error. although problem heart field optimal design, efficient solutions case many functionals exist. present bounds efficient sub-optimal solution problem structured sets binary functionals induced graph walks.",19 "generating thematic chinese poetry using conditional variational autoencoders hybrid decoders. computer poetry generation first step towards computer writing. writing must theme. current approaches using sequence-to-sequence models attention often produce non-thematic poems. present novel conditional variational autoencoder hybrid decoder adding deconvolutional neural networks general recurrent neural networks fully learn topic information via latent variables. approach significantly improves relevance generated poems representing line poem context-sensitive manner also holistic way highly related given keyword learned topic. 
proposed augmented word2vec model improves rhythm symmetry. tests show generated poems approach mostly satisfying regulated rules consistent themes, 73.42% receive overall score less 3 (the highest score 5).",4 "geometric dirichlet means algorithm topic inference. propose geometric algorithm topic learning inference built convex geometry topics arising latent dirichlet allocation (lda) model nonparametric extensions. end study optimization geometric loss function, surrogate lda's likelihood. method involves fast optimization based weighted clustering procedure augmented geometric corrections, overcomes computational statistical inefficiencies encountered techniques based gibbs sampling variational inference, achieving accuracy comparable gibbs sampler. topic estimates produced method shown statistically consistent conditions. algorithm evaluated extensive experiments simulated real data.",19 "job detection twitter. report, propose new application twitter data called \textit{job detection}. identify people's job category based tweets. preliminary work, limited task identify workers job holders. used compared simple bag words model document representation based skip-gram model. results show model based skip-gram, achieves 76\% precision 82\% recall.",4 "attention-based models text-dependent speaker verification. attention-based models recently shown great performance range tasks, speech recognition, machine translation, image captioning due ability summarize relevant information expands entire length input sequence. paper, analyze usage attention mechanisms problem sequence summarization end-to-end text-dependent speaker recognition system. explore different topologies variants attention layer, compare different pooling methods attention weights. ultimately, show attention-based models improves equal error rate (eer) speaker verification system relatively 14% compared non-attention lstm baseline model.",6 "serious?: rhetorical questions sarcasm social media dialog. 
effective models social dialog must understand broad range rhetorical figurative devices. rhetorical questions (rqs) type figurative language whose aim achieve pragmatic goal, structuring argument, persuasive, emphasizing point, ironic. computational models forms figurative language, rhetorical questions received little attention date. expand small dataset previous work, presenting corpus 10,270 rqs debate forums twitter represent different discourse functions. show clearly distinguish rqs sincere questions (0.76 f1). show rqs used sarcastically non-sarcastically, observing non-sarcastic (other) uses rqs frequently argumentative forums, persuasive tweets. present experiments distinguish uses rqs using svm lstm models represent linguistic features post-level context, achieving results high 0.76 f1 ""sarcastic"" 0.77 f1 ""other"" forums, 0.83 f1 ""sarcastic"" ""other"" tweets. supplement quantitative experiments in-depth characterization linguistic variation rqs.",4 "introducing elitist black-box models: elitist selection weaken performance evolutionary algorithms?. black-box complexity theory provides lower bounds runtime black-box optimizers like evolutionary algorithms serves inspiration design new genetic algorithms. several black-box models covering different classes algorithms exist, highlighting different aspect algorithms considerations. work add existing black-box notions new \emph{elitist black-box model}, algorithms required base decisions solely (a fixed number of) best search points sampled far. model combines features ranking-based memory-restricted black-box models elitist selection. provide several examples elitist black-box complexity exponentially larger respective complexities previous black-box models, thus showing elitist black-box complexity much closer runtime typical evolutionary algorithms. also introduce concept $p$-monte carlo black-box complexity, measures time takes optimize problem failure probability $p$. 
even small~$p$, $p$-monte carlo black-box complexity function class $\mathcal f$ smaller exponential factor typically regarded las vegas complexity (which measures \emph{expected} time takes optimize $\mathcal f$).",4 "medical image analysis using convolutional neural networks: review. medical image analysis science analyzing solving medical problems using different image analysis techniques effective efficient extraction information. emerged one top research area field engineering medicine. recent years witnessed rapid use machine learning algorithms medical image analysis. machine learning techniques used extract compact information improved performance medical image analysis system, compared traditional methods use extraction handcrafted features. deep learning breakthrough machine learning techniques overwhelmed field pattern recognition computer vision research providing state-of-the-art results. deep learning provides different machine learning algorithms model high level data abstractions rely handcrafted features. recently, deep learning methods utilizing deep convolutional neural networks applied medical image analysis providing promising results. application area covers whole spectrum medical image analysis including detection, segmentation, classification, computer aided diagnosis. paper presents review state-of-the-art convolutional neural network based techniques used medical image analysis.",4 "clickbait detection using word embeddings. clickbait pejorative term describing web content aimed generating online advertising revenue, especially expense quality accuracy, relying sensationalist headlines eye-catching thumbnail pictures attract click-throughs encourage forwarding material online social networks. use distributed word representations words title features identify clickbaits online news media. train machine learning model using linear regression predict clickbait score given tweet. methods achieve f1-score 64.98\% mse 0.0791.
compared methods, method simple, fast train, require extensive feature engineering yet moderately effective.",4 "microwave imaging enhancement technique noisy synthetic data. inverse iterative algorithm microwave imaging based moment method solution presented here. iterative scheme developed constrained optimization technique certain converge. different mesh size model used overcome inverse crime. synthetic data receivers contaminated different percentage noise. ill-posedness problem solved levenberg-marquardt method. algorithm applied synthetic data reconstructed image enhanced image enhancement technique",4 "geometric proof calibration. provide yet another proof existence calibrated forecasters; two merits. first, valid arbitrary finite number outcomes. second, short simple follows direct application blackwell's approachability theorem carefully chosen vector-valued payoff function convex target set. proof captures essence existing proofs based approachability (e.g., proof foster, 1999 case binary outcomes) highlights intrinsic connection approachability calibration.",19 "videostory embeddings recognize events examples scarce. paper aims event recognition video examples scarce even completely absent. key challenging setting semantic video representation. rather building representation individual attribute detectors annotations, propose learn entire representation freely available web videos descriptions using embedding video features term vectors. proposed embedding, call videostory, correlations terms utilized learn effective representation optimizing joint objective balancing descriptiveness predictability. we show learning videostory using multimodal predictability loss, including appearance, motion audio features, results better predictable representation. also propose variant videostory recognize event video important terms text query introducing term sensitive descriptiveness loss.
experiments three challenging collections web videos nist trecvid multimedia event detection columbia consumer videos datasets demonstrate: i) advantages videostory representations using attributes alternative embeddings, ii) benefit fusing video modalities embedding common strategies, iii) complementarity term sensitive descriptiveness multimodal predictability event recognition without examples. abilities improve predictability upon underlying video feature time maximizing semantic descriptiveness, videostory leads state-of-the-art accuracy few- zero-example recognition events video.",4 "biological gradient descent prediction combination stdp homeostatic plasticity. identifying, formalizing combining biological mechanisms implement known brain functions, prediction, main aspect current research theoretical neuroscience. letter, mechanisms spike timing dependent plasticity (stdp) homeostatic plasticity, combined original mathematical formalism, shown shape recurrent neural networks predictors. following rigorous mathematical treatment, prove implement online gradient descent distance network activity stimuli. convergence equilibrium, network spontaneously reproduce predict stimuli, suffer bifurcation issues usually encountered learning recurrent neural networks.",16 "understanding effective receptive field deep convolutional neural networks. study characteristics receptive fields units deep convolutional networks. receptive field size crucial issue many visual tasks, output must respond large enough areas image capture information large objects. introduce notion effective receptive field, show gaussian distribution occupies fraction full theoretical receptive field. analyze effective receptive field several architecture designs, effect nonlinear activations, dropout, sub-sampling skip connections it. leads suggestions ways address tendency small.",4 "small moving window calibration models soft sensing processes limited history. 
five simple soft sensor methodologies two update conditions compared two experimentally-obtained datasets one simulated dataset. soft sensors investigated moving window partial least squares regression (and recursive variant), moving window random forest regression, mean moving window $y$, novel random forest partial least squares regression ensemble (rf-pls), used small sample sizes rapidly placed online. found that, two datasets studied, small window sizes led lowest prediction errors moving window methods studied. majority datasets studied, rf-pls calibration method offered lowest one-step-ahead prediction errors compared methods, demonstrated greater predictive stability larger time delays moving window pls alone. found random forest rf-pls methods adequately modeled datasets feature purely monotonic increases property values, methods performed poorly moving window pls models one dataset purely monotonic property values. data dependent findings presented discussed.",19 "probabilistic transmission expansion planning methodology based roulette wheel selection social welfare. new probabilistic methodology transmission expansion planning (tep) require priori specification new/additional transmission capacities uses concept social welfare proposed. two new concepts introduced paper: (i) roulette wheel methodology used calculate capacity new transmission lines (ii) load flow analysis used calculate expected demand served (edns). overall methodology implemented modified ieee 5-bus test system. simulations show important result: addition new transmission lines sufficient minimize edns.",4 "parcellation fmri datasets ica pls-a data driven approach. inter-subject parcellation functional magnetic resonance imaging (fmri) data based standard general linear model (glm) and spectral clustering recently proposed means alleviate issues associated spatial normalization fmri.
however, appeal, glm-based parcellation approach introduces biases, form priori knowledge shape hemodynamic response function (hrf) task-related signal changes, subject behaviour task. paper, introduce data-driven version spectral clustering parcellation, based independent component analysis (ica) partial least squares (pls) instead glm. first, number independent components automatically selected. seed voxels obtained associated ica maps compute pls latent variables fmri signal seed voxels (which covers regional variations hrf) principal components signal across voxels. finally, parcellate subjects data spectral clustering pls latent variables. present results application proposed method single-subject multi-subject fmri datasets. preliminary experimental results, evaluated intra-parcel variance glm t-values pls derived t-values, indicate data-driven approach offers improvement terms parcellation accuracy glm based techniques.",4 "cheap bandits. consider stochastic sequential learning problems learner observe \textit{average reward several actions}. setting interesting many applications involving monitoring surveillance, set actions observe represent (geographical) area. importance setting applications, actually \textit{cheaper} observe average reward group actions rather reward single action. show reward \textit{smooth} given graph representing neighboring actions, maximize cumulative reward learning \textit{minimizing sensing cost}. paper propose cheapucb, algorithm matches regret guarantees known algorithms setting time guarantees linear cost them. by-product analysis, establish $\omega(\sqrt{dt})$ lower bound cumulative regret spectral bandits class graphs effective dimension $d$.",4 "play: retrieval video segments using natural-language queries. paper, propose new approach retrieval video segments using natural language queries. 
unlike previous approaches concept-based methods rule-based structured models, proposed method uses image captioning model construct sentential queries visual information. detail, approach exploits multiple captions generated visual features image `densecap'. then, similarities captions adjacent images calculated, used track semantically similar captions multiple frames. besides introducing novel idea 'tracking captioning', proposed method one first approaches uses language generation model learned neural networks construct semantic query describing relations properties visual information. evaluate effectiveness approach, created new evaluation dataset, contains 348 segments scenes 20 movie-trailers. quantitative qualitative evaluation, show method effective retrieval video segments using natural language queries.",4 "classification polar-thermal eigenfaces using multilayer perceptron human face recognition. paper presents novel approach handle challenges face recognition. work thermal face images considered, minimizes affect illumination changes occlusion due moustache, beards, adornments etc. proposed approach registers training testing thermal face images polar coordinate, capable handle complicacies introduced scaling rotation. polar images projected eigenspace finally classified using multi-layer perceptron. experiments used object tracking classification beyond visible spectrum (otcbvs) database benchmark thermal face images. experimental results show proposed approach significantly improves verification identification performance success rate 97.05%.",4 "maximum entropy deep inverse reinforcement learning. paper presents general framework exploiting representational capacity neural networks approximate complex, nonlinear reward functions context solving inverse reinforcement learning (irl) problem. show context maximum entropy paradigm irl lends naturally efficient training deep architectures. 
test time, approach leads computational complexity independent number demonstrations, makes especially well-suited applications life-long learning scenarios. approach achieves performance commensurate state-of-the-art existing benchmarks exceeding alternative benchmark based highly varying reward structures. finally, extend basic architecture - equivalent simplified subclass fully convolutional neural networks (fcnns) width one - include larger convolutions order eliminate dependency precomputed spatial features work raw input representations.",4 "effectiveness least squares generative adversarial networks. unsupervised learning generative adversarial networks (gans) proven hugely successful. regular gans hypothesize discriminator classifier sigmoid cross entropy loss function. however, found loss function may lead vanishing gradients problem learning process. overcome problem, propose paper least squares generative adversarial networks (lsgans) adopt least squares loss function discriminator. show minimizing objective function lsgan yields minimizing pearson $\chi^2$ divergence. also present theoretical analysis properties lsgans $\chi^2$ divergence. two benefits lsgans regular gans. first, lsgans able generate higher quality images regular gans. second, lsgans perform stable learning process. evaluating image quality, train lsgans several datasets including lsun cat dataset, experimental results show images generated lsgans better quality ones generated regular gans. furthermore, evaluate stability lsgans two groups. one compare lsgans regular gans without gradient penalty. conduct three experiments, including gaussian mixture distribution, difficult architectures, new proposed method --- datasets small variance, illustrate stability lsgans. one compare lsgans gradient penalty wgans gradient penalty (wgans-gp). 
experimental results show lsgans gradient penalty succeed training difficult architectures used wgans-gp, including 101-layer resnet.",4 "unsupervised learning monocular depth estimation visual odometry deep feature reconstruction. despite learning based methods showing promising results single view depth estimation visual odometry, existing approaches treat tasks supervised manner. recent approaches single view depth estimation explore possibility learning without full supervision via minimizing photometric error. paper, explore use stereo sequences learning depth visual odometry. use stereo sequences enables use spatial (between left-right pairs) temporal (forward backward) photometric warp error, constrains scene depth camera motion common, real-world scale. test time framework able estimate single view depth two-view odometry monocular sequence. also show improve standard photometric warp loss considering warp deep features. show extensive experiments that: (i) jointly training single view depth visual odometry improves depth prediction additional constraint imposed depths achieves competitive results visual odometry; (ii) deep feature-based warping loss improves upon simple photometric warp loss single view depth estimation visual odometry. method outperforms existing learning based methods kitti driving dataset tasks. source code available https://github.com/huangying-zhan/depth-vo-feat",4 "online learning switching costs adaptive adversaries. study power different types adaptive (nonoblivious) adversaries setting prediction expert advice, full-information bandit feedback. measure player's performance using new notion regret, also known policy regret, better captures adversary's adaptiveness player's behavior. setting losses allowed drift, characterize ---in nearly complete manner--- power adaptive adversaries bounded memories switching costs. particular, show switching costs, attainable rate bandit feedback $\widetilde{\theta}(t^{2/3})$. 
interestingly, rate significantly worse $\theta(\sqrt{t})$ rate attainable switching costs full-information case. via novel reduction experts bandits, also show bounded memory adversary force $\widetilde{\theta}(t^{2/3})$ regret even full information case, proving switching costs easier control bounded memory adversaries. lower bounds rely new stochastic adversary strategy generates loss processes strong dependencies.",4 "quality expectations machine translation. machine translation (mt) deployed range use-cases millions people daily basis. should, therefore, doubt utility mt. however, everyone convinced mt useful, especially productivity enhancer human translators. chapter, address issue, describing mt currently deployed, output evaluated could enhanced, especially mt quality improves. central issues acceptance longer single 'gold standard' measure quality, situation mt deployed needs borne mind, especially respect expected 'shelf-life' translation itself.",4 "technical report: image captioning semantically similar images. report presents submission ms coco captioning challenge 2015. method uses convolutional neural network activations embedding find semantically similar images. images, typical caption selected based unigram frequencies. although method received low scores automated evaluation metrics human assessed average correctness, competitive ratio captions pass turing test assessed better equal human captions.",4 "compressed model residual cnds. convolutional neural networks achieved great success recent years. although, way maximize performance convolutional neural networks still beginning. furthermore, optimization size time need train convolutional neural networks far away reaching researcher's ambition. paper, proposed new convolutional neural network combined several techniques boost optimization convolutional neural network aspects speed size. 
used previous model residual-cnds (rescnds), solved problems slower convergence, overfitting, degradation, compressed it. outcome model called residual-squeeze-cnds (ressqucnds), demonstrated solid technique add residual learning model compressing convolutional neural networks. model compressing adapted squeezenet model, model generalizable, applied almost neural network model, fully integrated residual learning, addresses problem degradation successfully. proposed model trained large-scale mit places365-standard scene datasets, backing hypothesis new compressed model inherited best previous rescnds8 model, almost get accuracy validation top-1 top-5 87.64% smaller size 13.33% faster training time.",4 "bipartite graph matching keyframe summary evaluation. keyframe summary, ""static storyboard"", collection frames video designed summarise semantic content. many algorithms proposed extract summaries automatically. best evaluate outputs important little-discussed question. review current methods matching frames two summaries formalism graph theory. analysis revealed different behaviours methods, illustrate number case studies. based results, recommend greedy matching algorithm due kannappan et al.",4 "specious rules: efficient effective unifying method removing misleading uninformative patterns association rule mining. present theoretical analysis suite tests procedures addressing broad class redundant misleading association rules call \emph{specious rules}. specious dependencies, also known \emph{spurious}, \emph{apparent}, \emph{illusory associations}, refer well-known phenomenon marginal dependencies merely products interactions variables disappear conditioned variables. extreme example yule-simpson's paradox two variables present positive dependence marginal contingency table negative partial tables defined different levels confounding factor. accepted wisdom data nontrivial dimensionality infeasible control exponentially many possible confounds nature.
paper, consider problem specious dependencies context statistical association rule mining. define specious rules show offer unifying framework covers many types previously proposed redundant misleading association rules. theoretical analysis, introduce practical algorithms detecting pruning specious association rules efficiently many key goodness measures, including mutual information exact hypergeometric probabilities. demonstrate procedure greatly reduces number associations discovered, providing elegant effective solution problem association mining discovering large numbers misleading redundant rules.",4 "lipschitz exploration-exploitation scheme bayesian optimization. problem optimizing unknown costly-to-evaluate functions studied long time context bayesian optimization. algorithms field aim find optimizer function asking function evaluations locations carefully selected based posterior model. paper, assume unknown function lipschitz continuous. leveraging lipschitz property, propose algorithm distinct exploration phase followed exploitation phase. exploration phase aims select samples shrink search space much possible. exploitation phase focuses reduced search space selects samples closest optimizer. considering expected improvement (ei) baseline, empirically show proposed algorithm significantly outperforms ei.",4 "evaluation deep learning abstract image classification dataset. convolutional neural networks become state art methods image classification last couple years. perform better human subjects many image classification datasets. datasets based notion concrete classes (i.e. images classified type object image). paper present novel image classification dataset, using abstract classes, easy solve humans, variations challenging cnns. classification performance popular cnn architectures evaluated dataset variations dataset might interesting research identified.",4 "simulating spiking neural p systems without delays using gpus. 
present paper work regarding simulating type p system known spiking neural p system (snp system) using graphics processing units (gpus). gpus, architectural optimization parallel computations, well-suited highly parallelizable problems. due advent general purpose gpu computing recent years, gpus limited graphics video processing alone, include computationally intensive scientific mathematical applications well. moreover p systems, including snp systems, inherently maximally parallel computing models whose inspirations taken functioning dynamics living cell. particular, snp systems try give modest formal representation special type cell known neuron interactions one another. nature snp systems allowed representation matrices, crucial step simulating highly parallel devices gpus. highly parallel nature snp systems necessitate use hardware intended parallel computations. simulation algorithms, design considerations, implementation presented. finally, simulation results, observations, analyses using snp system generates numbers $\mathbb n$ - {1} discussed, well recommendations future work.",4 "distributed algorithm training nonlinear kernel machines. paper concerns distributed training nonlinear kernel machines map-reduce. show re-formulation nystr\""om approximation based solution solved using gradient based techniques well suited this, especially necessary work large number basis points. main advantages approach are: avoidance computing pseudo-inverse kernel sub-matrix corresponding basis points; simplicity efficiency distributed part computations; and, friendliness stage-wise addition basis points. implement method using allreduce tree hadoop demonstrate value large benchmark datasets.",4 "analysis random algorithm estimating matchings. 
counting number matchings bipartite graph transformed calculating permanent matrix obtained extended bipartite graph yan huo, rasmussen presents simple approach (rm) approximate permanent, yields critical ratio o($n\omega(n)$) almost 0-1 matrices, provided simple promising practical way compute #p-complete problem. paper, performance method shown applied compute matchings based transformation. critical ratio proved large certain probability, owning increasing factor larger polynomial $n$ even sense almost 0-1 matrices. hence, rm fails work well counting matchings via computing permanent matrix. words, must carefully utilize known methods estimating permanent count matchings transformation.",4 "incremental construction minimal acyclic finite-state automata. paper, describe new method constructing minimal, deterministic, acyclic finite-state automata set strings. traditional methods consist two phases: first construct trie, second one minimize it. approach construct minimal automaton single phase adding new strings one one minimizing resulting automaton on-the-fly. present general algorithm well specialization relies upon lexicographical ordering input strings.",4 "constraints, lazy constraints, propagators asp solving: empirical analysis. answer set programming (asp) well-established declarative paradigm. one successes asp availability efficient systems. state-of-the-art systems based ground+solve approach. applications approach infeasible grounding one constraints expensive. paper, systematically compare alternative strategies avoid instantiation problematic constraints, based custom extensions solver. results real synthetic benchmarks highlight strengths weaknesses different strategies. (under consideration acceptance tplp, iclp 2017 special issue.)",4 "building rules top ontologies semantic web inductive logic programming. building rules top ontologies ultimate goal logical layer semantic web. aim ad-hoc mark-up language layer currently discussion. 
intended follow tradition hybrid knowledge representation reasoning systems $\mathcal{al}$-log integrates description logic $\mathcal{alc}$ function-free horn clausal language \textsc{datalog}. paper consider problem automating acquisition rules semantic web. propose general framework rule induction adopts methodological apparatus inductive logic programming relies expressive deductive power $\mathcal{al}$-log. framework valid whatever scope induction (description vs. prediction) is. yet, illustrative purposes, also discuss instantiation framework aims description turns useful ontology refinement. keywords: inductive logic programming, hybrid knowledge representation reasoning systems, ontologies, semantic web. note: appear theory practice logic programming (tplp)",4 "lost time: temporal analytics long-term video surveillance. video surveillance well researched area study substantial work done aspects object detection, tracking behavior analysis. abundance video data captured long period time, understand patterns human behavior scene dynamics data-driven temporal analytics. work, propose two schemes perform descriptive predictive analytics long-term video surveillance data. generate heatmap footmap visualizations describe spatially pooled trajectory patterns respect time location. also present two approaches anomaly prediction day-level granularity: trajectory-based statistical approach, time-series based approach. experimentation one year data single camera demonstrates ability uncover interesting insights scene predict anomalies reasonably well.",4 "surveillance video parsing single frame supervision. surveillance video parsing, segments video frames several labels, e.g., face, pants, left-leg, wide applications. however, pixel-wisely annotating frames tedious inefficient. paper, develop single frame video parsing (svp) method requires one labeled frame per video training stage. parse one particular frame, video segment preceding frame jointly considered. 
svp (1) roughly parses frames within video segment, (2) estimates optical flow frames (3) fuses rough parsing results warped optical flow produce refined parsing result. three components svp, namely frame parsing, optical flow estimation temporal fusion integrated end-to-end manner. experimental results two surveillance video datasets show superiority svp state-of-the-arts.",4 "solving problem k parameter knn classifier using ensemble learning approach. paper presents new solution choosing k parameter k-nearest neighbor (knn) algorithm, solution depending idea ensemble learning, weak knn classifier used time different k, starting one square root size training set. results weak classifiers combined using weighted sum rule. proposed solution tested compared solutions using group experiments real life problems. experimental results show proposed classifier outperforms traditional knn classifier uses different number neighbors, competitive classifiers, promising classifier strong potential wide range applications.",4 "video genome. fast evolution internet technologies led explosive growth video data available public domain created unprecedented challenges analysis, organization, management, control content. problems encountered video analysis identifying video large database (e.g. detecting pirated content youtube), putting together video fragments, finding similarities common ancestry different versions video, analogous counterpart problems genetic research analysis dna protein sequences. paper, exploit analogy genetic sequences videos propose approach video analysis motivated genomic research. representing video information video dna sequences applying bioinformatic algorithms allows search, match, compare videos large-scale databases. show application content-based metadata mapping versions annotated video.",4 "pruning variable selection ensembles. 
context variable selection, ensemble learning gained increasing interest due great potential improve selection accuracy reduce false discovery rate. novel ordering-based selective ensemble learning strategy designed paper obtain smaller accurate ensembles. particular, greedy sorting strategy proposed rearrange order members included integration process. stopping fusion process early, smaller subensemble higher selection accuracy obtained. importantly, sequential inclusion criterion reveals fundamental strength-diversity trade-off among ensemble members. taking stability selection (abbreviated stabsel) example, experiments conducted simulated real-world data examine performance novel algorithm. experimental results demonstrate pruned stabsel generally achieves higher selection accuracy lower false discovery rates stabsel several benchmark methods.",19 "theory optimizing pseudolinear performance measures: application f-measure. non-linear performance measures widely used evaluation learning algorithms. example, $f$-measure commonly used performance measure classification problems machine learning information retrieval community. study theoretical properties subset non-linear performance measures called pseudo-linear performance measures includes $f$-measure, \emph{jaccard index}, among many others. establish many notions $f$-measures \emph{jaccard index} pseudo-linear functions per-class false negatives false positives binary, multiclass multilabel classification. based observation, present general reduction performance measure optimization problem cost-sensitive classification problem unknown costs. propose algorithm provable guarantees obtain approximately optimal classifier $f$-measure solving series cost-sensitive classification problems. strength analysis valid dataset class classifiers, extending existing theoretical results pseudo-linear measures, asymptotic nature. 
also establish multi-objective nature $f$-score maximization problem linking algorithm weighted-sum approach used multi-objective optimization. present numerical experiments illustrate relative importance cost asymmetry thresholding learning linear classifiers various $f$-measure optimization tasks.",4 "arguments effectiveness human problem solving. question humans solve problem addressed extensively. however, direct study effectiveness process seems overlooked. paper, address issue effectiveness human problem solving: analyze effectiveness comes cognitive mechanisms heuristics involved. results based optimal probabilistic problem solving strategy appeared solomonoff paper general problem solving system. provide arguments certain set cognitive mechanisms heuristics drive human problem solving similar manner optimal solomonoff strategy. results presented paper serve cognitive psychology better understanding human problem solving processes well artificial intelligence designing human-like agents.",4 "infinite hierarchical factor regression model. propose nonparametric bayesian factor regression model accounts uncertainty number factors, relationship factors. accomplish this, propose sparse variant indian buffet process couple hierarchical model factors, based kingman's coalescent. apply model two problems (factor analysis factor regression) gene-expression data analysis.",4 "plausibility probability?(revised 2003, 2015). present examine result related uncertainty reasoning, namely certain plausibility space cox's type uniquely embedded minimal ordered field. this, although purely mathematical result, claimed imply every rational method reason uncertainty must based sets extended probability distributions, extended probability standard probability extended infinitesimals. claim must supported argumentation non-mathematical type, however, since pure mathematics tell us anything world. propose one argumentation, relate results literature uncertainty statistics. 
added retrospective section discuss developments area regarding countable additivity, partially ordered domains robustness, philosophical stances cox/jaynes approach since 2003. also show general partially ordered plausibility calculus embeddable ring represented set extended probability distributions or, algebraic terms, subdirect sum ordered fields. words, robust bayesian approach universal. result exemplified relating dempster-shafer's evidence theory robust bayesian analysis.",4 "speaker diarization lstm. many years, i-vector based audio embedding techniques dominant approach speaker verification speaker diarization applications. however, mirroring rise deep learning various domains, neural network based audio embeddings, also known d-vectors, consistently demonstrated superior speaker verification performance. paper, build success d-vector based speaker verification systems develop new d-vector based approach speaker diarization. specifically, combine lstm-based d-vector audio embeddings recent work non-parametric clustering obtain state-of-the-art speaker diarization system. system evaluated three standard public datasets, suggesting d-vector based diarization systems offer significant advantages traditional i-vector based systems. achieved 12.0% diarization error rate nist sre 2000 callhome, model trained out-of-domain data voice search logs.",6 "design analysis multiple view descriptors. propose extension popular descriptors based gradient orientation histograms (hog, computed single image) multiple views. hinges interpreting hog conditional density space sampled images, effects nuisance factors viewpoint illumination marginalized. however, marginalization performed respect coarse approximation underlying distribution. extension leverages fact multiple views scene allow separating intrinsic nuisance variability, thus afford better marginalization latter. 
result descriptor complexity single-view hog, compared manner, exploits multiple views better trade insensitivity nuisance variability specificity intrinsic variability. also introduce novel multi-view wide-baseline matching dataset, consisting mixture real synthetic objects ground truthed camera motion dense three-dimensional geometry.",4 "context-sensitive super-resolution fast fetal magnetic resonance imaging. 3d magnetic resonance imaging (mri) often trade-off fast low-resolution image acquisition highly detailed slow image acquisition. fast imaging required targets move avoid motion artefacts. particular difficult fetal mri. spatially independent upsampling techniques, state-of-the-art address problem, error prone disregard contextual information. paper propose context-sensitive upsampling method based residual convolutional neural network model learns organ specific appearance adopts semantically input data allowing generation high resolution images sharp edges fine scale detail. making contextual decisions appearance shape, present different parts image, gain maximum structural detail similar contrast provided high-resolution data. experiment $145$ fetal scans show approach yields increased psnr $1.25$ $db$ applied under-sampled fetal data \emph{cf.} baseline upsampling. furthermore, method yields increased psnr $1.73$ $db$ utilizing under-sampled fetal data perform brain volume reconstruction motion corrupted captured data.",4 "learning hierarchical information flow recurrent neural modules. propose thalnet, deep learning model inspired neocortical communication via thalamus. model consists recurrent neural modules send features routing center, endowing modules flexibility share features multiple time steps. show model learns route information hierarchically, processing input data chain modules. 
observe common architectures, feed forward neural networks skip connections, emerging special cases architecture, novel connectivity patterns learned text8 compression task. model outperforms standard recurrent neural networks several sequential benchmarks.",4 "image captioning classification dangerous situations. current robot platforms employed collaborate humans wide range domestic industrial tasks. environments require autonomous systems able classify communicate anomalous situations fires, injured persons, car accidents; generally, potentially dangerous situation humans. paper introduce anomaly detection dataset purpose robot applications well design implementation deep learning architecture classifies describes dangerous situations using single image input. report classification accuracy 97 % meteor score 16.2. make dataset publicly available paper accepted.",4 "real-coded chemical reaction optimization different perturbation functions. chemical reaction optimization (cro) powerful metaheuristic mimics interactions molecules chemical reactions search global optimum. perturbation function greatly influences performance cro solving different continuous problems. paper, study four different probability distributions, namely, gaussian distribution, cauchy distribution, exponential distribution, modified rayleigh distribution, perturbation function cro. different distributions different impacts solutions. distributions tested set well-known benchmark functions simulation results show problems different characteristics different preference distribution function. study gives guidelines design cro different types optimization problems.",4 "local neighborhood intensity pattern: new texture feature descriptor image retrieval. paper, new texture descriptor based local neighborhood intensity difference proposed content based image retrieval (cbir). 
computation texture features like local binary pattern (lbp), center pixel 3*3 window image compared remaining neighbors, one pixel time generate binary bit pattern. ignores effect adjacent neighbors particular pixel binary encoding also texture description. proposed method based concept neighbors particular pixel hold significant amount texture information considered efficient texture representation cbir. taking account, develop new texture descriptor, named local neighborhood intensity pattern (lnip) considers relative intensity difference particular pixel center pixel considering adjacent neighbors generate sign magnitude pattern. since sign magnitude patterns hold complementary information other, two patterns concatenated single feature descriptor generate concrete useful feature descriptor. proposed descriptor tested image retrieval four databases, including three texture image databases - brodatz texture image database, mit vistex database salzburg texture database one face database at&t face database. precision recall values observed databases compared state-of-art local patterns. proposed method showed significant improvement many existing methods.",4 "admm-based networked stochastic variational inference. owing recent advances ""big data"" modeling prediction tasks, variational bayesian estimation gained popularity due ability provide exact solutions approximate posteriors. one key technique approximate inference stochastic variational inference (svi). svi poses variational inference stochastic optimization problem solves iteratively using noisy gradient estimates. aims handle massive data predictive classification tasks applying complex bayesian models observed well latent variables. paper aims decentralize allowing parallel computation, secure learning robustness benefits. 
use alternating direction method multipliers top-down setting develop distributed svi algorithm independent learners running inference algorithms require sharing estimated model parameters instead private datasets. work extends distributed svi-admm algorithm first propose, admm-based networked svi algorithm learners working distributively share information according rules graph form network. kind work lies umbrella `deep learning networks' verify algorithm topic-modeling problem corpus wikipedia articles. illustrate results latent dirichlet allocation (lda) topic model large document classification, compare performance centralized algorithm, use numerical experiments corroborate analytical results.",4 "quality priority ratios estimation relation selected prioritization procedure consistency measure pairwise comparison matrix. overview current debates contemporary research devoted modeling decision making processes facilitation directs attention analytic hierarchy process (ahp). core ahp various prioritization procedures (pps) consistency measures (cms) pairwise comparison matrix (pcm) which, sense, reflects preferences decision makers. certainly, judgments preferences perfectly consistent (cardinally transitive), pps coincide quality priority ratios (prs) estimation exemplary. however, human judgments rarely consistent, thus quality prs estimation may significantly vary. scale variations depends applied pp utilized cm pcm. important find pps cms pcm lead directly improvement prs estimation accuracy. main goal research realized properly designed, coded executed seminal sophisticated simulation algorithms wolfram mathematica 8.0. research results convince embedded ahp commonly applied, genuine pp cm pcm may significantly deteriorate quality prs estimation; however, solutions proposed paper significantly improve methodology.",4 "anatomy search mining system digital archives. 
samtla (search mining tools linguistic analysis) digital humanities system designed collaboration historians linguists assist research work quantifying content textual corpora approximate phrase search document comparison. retrieval engine uses character-based n-gram language model rather conventional word-based one achieve great flexibility language agnostic query processing. index implemented space-optimised character-based suffix tree accompanying database document content metadata. number text mining tools integrated system allow researchers discover textual patterns, perform comparative analysis, find currently popular research community. herein describe system architecture, user interface, models algorithms, data storage samtla system. also present several case studies usage practice together evaluation systems' ranking performance crowdsourcing.",4 "fast learning rate deep learning via kernel perspective. develop new theoretical framework analyze generalization error deep learning, derive new fast learning rate two representative algorithms: empirical risk minimization bayesian deep learning. series theoretical analyses deep learning revealed high expressive power universal approximation capability. although analyses highly nonparametric, existing generalization error analyses developed mainly fixed dimensional parametric model. compensate gap, develop infinite dimensional model based integral form performed analysis universal approximation capability. allows us define reproducing kernel hilbert space corresponding layer. point view deal ordinary finite dimensional deep neural network finite approximation infinite dimensional one. approximation error evaluated degree freedom reproducing kernel hilbert space layer. estimate good finite dimensional model, consider empirical risk minimization bayesian deep learning. derive generalization error bound shown appears bias-variance trade-off terms number parameters finite dimensional approximation. 
show optimal width internal layers determined degree freedom convergence rate faster $o(1/\sqrt{n})$ rate shown existing studies.",12 "automatic vertebra labeling large-scale 3d ct using deep image-to-image network message passing sparsity regularization. automatic localization labeling vertebra 3d medical images plays important role many clinical tasks, including pathological diagnosis, surgical planning postoperative assessment. however, unusual conditions pathological cases, abnormal spine curvature, bright visual imaging artifacts caused metal implants, limited field view, increase difficulties accurate localization. paper, propose automatic fast algorithm localize label vertebra centroids 3d ct volumes. first, deploy deep image-to-image network (di2in) initialize vertebra locations, employing convolutional encoder-decoder architecture together multi-level feature concatenation deep supervision. next, centroid probability maps di2in iteratively evolved message passing schemes based mutual relation vertebra centroids. finally, localization results refined sparsity regularization. proposed method evaluated public dataset 302 spine ct volumes various pathologies. method outperforms state-of-the-art methods terms localization accuracy. run time around 3 seconds average per case. boost performance, retrain di2in additional 1000+ 3d ct volumes different patients. best knowledge, first time 1000 3d ct volumes expert annotation adopted experiments anatomic landmark detection tasks. experimental results show training large dataset significantly improves performance overall identification rate, first time knowledge, reaches 90 %.",4 "training restricted boltzmann machines word observations. restricted boltzmann machine (rbm) flexible tool modeling complex data, however significant computational difficulties using rbms model high-dimensional multinomial observations. 
natural language processing applications, words naturally modeled k-ary discrete distributions, k determined vocabulary size easily hundreds thousands. conventional approach training rbms word observations limited requires sampling states k-way softmax visible units block gibbs updates, operation takes time linear k. work, address issue employing general class markov chain monte carlo operators visible units, yielding updates computational complexity independent k. demonstrate success approach training rbms hundreds millions word n-grams using larger vocabularies previously feasible using learned features improve performance chunking sentiment classification tasks, achieving state-of-the-art results latter.",4 "m3: scaling machine learning via memory mapping. process data fit ram, conventional wisdom would suggest using distributed approaches. however, recent research demonstrated virtual memory's strong potential scaling graph mining algorithms single machine. propose use similar approach general machine learning. contribute: (1) latest finding memory mapping also feasible technique scaling general machine learning algorithms like logistic regression k-means, data fits exceeds ram (we tested datasets 190gb); (2) approach, called m3, enables existing machine learning algorithms work out-of-core datasets memory mapping, achieving speed significantly faster 4-instance spark cluster, comparable 8-instance cluster.",4 "information assisted dictionary learning fmri data analysis. extracting information functional magnetic resonance images (fmri) major area research many years, still demanding accurate techniques. nowadays, plenty available information brain-behavior used develop precise methods. thus, paper presents new dictionary learning method allows incorporating external information regarding studied problem, novel sets constraints. 
finally, apply proposed method synthetic fmri data, several tests show improvement performance compared common techniques.",19 "use skeletons learning bayesian networks. paper, present heuristic operator aims simultaneously optimizing orientations edges intermediate bayesian network structure search process. done alternating space directed acyclic graphs (dags) space skeletons. found orientations edges based scoring function rather induced conditional independences. operator used extension commonly employed search strategies. evaluated experiments artificial real-world data.",4 additive non-negative matrix factorization missing data. non-negative matrix factorization (nmf) previously shown useful decomposition multivariate data. interpret factorization new way use generate missing attributes test data. provide joint optimization scheme missing attributes well nmf factors. prove monotonic convergence algorithms. present classification results cases missing attributes.,4 "visual decoding targets visual search human eye fixations. human gaze reveal users' intents extend intents inferred even visualized? gaze proposed implicit source information predict target visual search and, recently, predict object class attributes search target. work, go one step investigate feasibility combining recent advances encoding human gaze information using deep convolutional neural networks power generative image models visually decode, i.e. create visual representation of, search target. visual decoding challenging two reasons: 1) search target resides user's mind subjective visual pattern, often even described verbally person, 2) is, yet, unclear gaze fixations contain sufficient information task all. show, first time, visual representations search targets indeed decoded human gaze fixations. propose first encode fixations semantic representation decode representation image. 
evaluate method recent gaze dataset 14 participants searching clothing image collages validate model's predictions using two human studies. results show 62% (chance level = 10%) time users able select categories decoded image right. second studies show importance local gaze encoding decoding visual search targets user",4 "genetic algorithm approach solving flexible job shop scheduling problem. flexible job shop scheduling noticed effective manufacturing system cope rapid development today's competitive environment. flexible job shop scheduling problem (fjssp) known np-hard problem field optimization. considering dynamic state real world makes problem complicated. studies field fjssp focused minimizing total makespan. paper, mathematical model fjssp developed. objective function maximizing total profit meeting constraints. time-varying raw material costs selling prices dissimilar demands period, considered decrease gaps reality model. manufacturer produces various parts gas valves used case study. scheduling problem multi-part, multi-period, multi-operation parallel machines solved using genetic algorithm (ga). best obtained answer determines economic amount production different machines belong predefined operations part satisfy customer demand period.",12 "monte carlo tree search sampled information relaxation dual bounds. monte carlo tree search (mcts), famously used game-play artificial intelligence (e.g., game go), well-known strategy constructing approximate solutions sequential decision problems. primary innovation use heuristic, known default policy, obtain monte carlo estimates downstream values states decision tree. information used iteratively expand tree towards regions states actions optimal policy might visit. however, guarantee convergence optimal action, mcts requires entire tree expanded asymptotically. 
paper, propose new technique called primal-dual mcts utilizes sampled information relaxation upper bounds potential actions, creating possibility ""ignoring"" parts tree stem highly suboptimal choices. allows us prove despite converging partial decision tree limit, recommended action primal-dual mcts optimal. new approach shows significant promise used optimize behavior single driver navigating graph operating ride-sharing platform. numerical experiments real dataset 7,000 trips new jersey suggest primal-dual mcts improves upon standard mcts producing deeper decision trees exhibits reduced sensitivity size action space.",12 "surprise: you've got explaining. events surprising others? propose events difficult explain surprising. two experiments reported test impact different event outcomes (outcome-type) task demands (task) ratings surprise simple story scenarios. outcome-type variable, participants saw outcomes either known less-known surprising outcomes scenario. task variable, participants either answered comprehension questions provided explanation outcome. outcome-type reliably affected surprise judgments; known outcomes rated less surprising less-known outcomes. task also reliably affected surprise judgments; people provided explanation lowered surprise judgments relative simply answering comprehension questions. experiments thus provide evidence less-explored explanation aspect surprise, specifically showing ease explanation key factor determining level surprise experienced.",4 "analysis parallelized motion masking using dual-mode single gaussian models. motion detection video important number applications fields. video surveillance, motion detection essential accompaniment activity recognition early warning systems. robotics also much gain motion detection segmentation, particularly high speed motion tracking tactile systems. myriad techniques detecting masking motion image. 
successful systems used gaussian models discern background foreground image (motion static imagery). however, particularly case moving camera frame reference, necessary compensate motion camera attempting discern objects moving foreground. example, possible estimate motion camera optical flow methods temporal differencing compensate motion background subtraction model. selection method yi et al. using dual-mode single gaussian models this. implement technique intel's thread building blocks (tbb) nvidia's cuda libraries. compare parallelization improvements theoretical analysis speedups based characteristics selected model attributes tbb cuda. make implementation available public.",4 "museum exhibit identification challenge domain adaptation beyond. paper, approach open problem artwork identification propose new dataset dubbed open museum identification challenge (open mic). contains photos exhibits captured 10 distinct exhibition spaces several museums showcase paintings, timepieces, sculptures, glassware, relics, science exhibits, natural history pieces, ceramics, pottery, tools indigenous crafts. goal open mic stimulate research domain adaptation, egocentric recognition few-shot learning providing testbed complementary famous office dataset reaches 90% accuracy. form dataset, captured number images per art piece mobile phone wearable cameras form source target data splits, respectively. achieve robust baselines, build recent approach aligns per-class scatter matrices source target cnn streams [15]. moreover, exploit positive definite nature representations using end-to-end bregman divergences riemannian metric. present baselines training/evaluation per exhibition training/evaluation combined set covering 866 exhibit identities. exhibition poses distinct challenges e.g., quality lighting, motion blur, occlusions, clutter, viewpoint scale variations, rotations, glares, transparency, non-planarity, clipping, break results w.r.t. 
factors.",4 "characteristic kernels infinitely divisible distributions. connect shift-invariant characteristic kernels infinitely divisible distributions $\mathbb{r}^{d}$. characteristic kernels play important role machine learning applications kernel means distinguish two probability measures. contribution paper two-fold. first, show, using l\'evy-khintchine formula, shift-invariant kernel given bounded, continuous symmetric probability density function (pdf) infinitely divisible distribution $\mathbb{r}^d$ characteristic. also present closure property characteristic kernels addition, pointwise product, convolution. second, developing various kernel mean algorithms, fundamental compute following values: (i) kernel mean values $m_p(x)$, $x \in \mathcal{x}$, (ii) kernel mean rkhs inner products ${\left\langle m_p, m_q \right\rangle_{\mathcal{h}}}$, probability measures $p, q$. $p, q$, kernel $k$ gaussians, computation (i) (ii) results gaussian pdfs tractable. generalize gaussian combination general cases class infinitely divisible distributions. introduce {\it conjugate} kernel {\it convolution trick}, (i) (ii) pdf form, expecting tractable computation least cases. specific instances, explore $\alpha$-stable distributions rich class generalized hyperbolic distributions, laplace, cauchy student-t distributions included.",19 "adapting deep visuomotor representations weak pairwise constraints. real-world robotics problems often occur domains differ significantly robot's prior training environment. many robotic control tasks, real world experience expensive obtain, data easy collect either instrumented environment simulation. propose novel domain adaptation approach robot perception adapts visual representations learned large easy-to-obtain source dataset (e.g. synthetic images) target real-world domain, without requiring expensive manual data annotation real world data policy search. 
supervised domain adaptation methods minimize cross-domain differences using pairs aligned images contain object scene source target domains, thus learning domain-invariant representation. however, require manual alignment image pairs. fully unsupervised adaptation methods rely minimizing discrepancy feature distributions across domains. propose novel, powerful combination distribution pairwise image alignment, remove requirement expensive annotation using weakly aligned pairs images source target domains. focusing adapting simulation real world data using pr2 robot, evaluate approach manipulation task show using weakly paired images, method compensates domain shift effectively previous techniques, enabling better robot performance real world.",4 "model selection nonlinear embedding unsupervised domain adaptation. domain adaptation deals adapting classifiers trained data source distribution, work effectively data target distribution. paper, introduce nonlinear embedding transform (net) unsupervised domain adaptation. net reduces cross-domain disparity nonlinear domain alignment. also embeds domain-aligned data similar data points clustered together. results enhanced classification. determine parameters net model (and unsupervised domain adaptation models), introduce validation procedure sampling source data points similar distribution target data. test net validation procedure using popular image datasets compare classification results across competitive procedures unsupervised domain adaptation.",4 "improvement/extension modular systems combinatorial reengineering (survey). paper describes development (improvement/extension) approaches composite (modular) systems (as combinatorial reengineering). 
following system improvement/extension actions considered: (a) improvement systems component(s) (e.g., improvement system component, replacement system component); (b) improvement system component interconnection (compatibility); (c) joint improvement improvement system components(s) interconnection; (d) improvement system structure (replacement system part(s), addition system part, deletion system part, modification system structure). study system improvement approaches involve crucial issues: (i) scales evaluation system components component compatibility (quantitative scale, ordinal scale, poset-like scale, scale based interval multiset estimate), (ii) evaluation integrated system quality, (iii) integration methods obtain integrated system quality. system improvement/extension strategies examined selection/combination improvement action(s) modification system structure. strategies based combinatorial optimization problems (e.g., multicriteria selection, knapsack problem, multiple choice problem, combinatorial synthesis based morphological clique problem, assignment/reassignment problem, graph recoloring problem, spanning problems, hotlink assignment). here, heuristics used. various system improvement/extension strategies presented including illustrative numerical examples.",4 "learning efficient image representation person re-identification. color names based image representation successfully used person re-identification, due advantages compact, intuitively understandable well robust photometric variance. however, exists diversity underlying distribution color names' rgb values image pixels' rgb values, may lead inaccuracy directly comparing euclidean space. paper, propose new method named soft gaussian mapping (sgm) address problem. model discrepancies color names pixels using gaussian utilize inverse covariance matrix bridge gap them. based sgm, image could converted several soft gaussian maps.
soft gaussian map, seek establish stable robust descriptors within local region max pooling operation. then, robust image representation based color names obtained concatenating statistical descriptors stripe. labeled data available, one discriminative subspace projection matrix learned build efficient representations image via cross-view coupling learning. experiments public datasets - viper, prid450s cuhk03, demonstrate effectiveness method.",4 "benchmarking decoupled neural interfaces synthetic gradients. artificial neural networks particular class learning systems modeled biological neural functions interesting penchant hebbian learning, ""neurons fire together, wire together"". however, unlike natural counterparts, artificial neural networks close stringent coupling modules neurons network. coupling locking imposes upon network strict inflexible structure prevent layers network updating weights full feed-forward backward pass occurred. constraint though may sufficed while, longer feasible era very-large-scale machine learning, coupled increased desire parallelization learning process across multiple computing infrastructures. solve problem, synthetic gradients (sg) decoupled neural interfaces (dni) introduced viable alternative backpropagation algorithm. paper performs speed benchmark compare speed accuracy capabilities sg-dni opposed standard neural interface using multilayer perceptron mlp. sg-dni shows good promise, captures learning problem, also 3-fold faster due asynchronous learning capabilities.",4 "superpixel based segmentation classification polyps wireless capsule endoscopy. wireless capsule endoscopy (wce) relatively new technology record entire gi trace, vivo. large amounts frames captured examination cause difficulties physicians review frames. need reducing reviewing time using intelligent methods challenge. polyps considered growing tissues surface intestinal tract inside organ.
polyps cancerous, one becomes larger centimeter, turn cancer great chance. wce frames provide early stage possibility detection polyps. here, application simple linear iterative clustering (slic) superpixel segmentation polyps wce frames evaluated. different slic superpixel numbers examined find highest sensitivity detection polyps. slic superpixel segmentation promising improve results previous studies. finally, superpixels classified using support vector machine (svm) extracting texture color features. classification results showed sensitivity 91%.",4 "semi-supervised cross-entropy clustering information bottleneck constraint. paper, propose semi-supervised clustering method, cec-ib, models data set gaussian distributions retrieves clusters based partial labeling provided user (partition-level side information). combining ideas cross-entropy clustering (cec) information bottleneck method (ib), method trades three conflicting goals: accuracy data set modeled, simplicity model, consistency clustering side information. experiments demonstrate cec-ib performance comparable gaussian mixture models (gmm) classical semi-supervised scenario, faster, robust noisy labels, automatically determines optimal number clusters, performs well classes present side information. moreover, contrast semi-supervised models, successfully applied discovering natural subgroups partition-level side information derived top levels hierarchical clustering.",4 "histogram oriented principal components cross-view action recognition. existing techniques 3d action recognition sensitive viewpoint variations extract features depth images viewpoint dependent. contrast, directly process pointclouds cross-view action recognition unknown unseen views. propose histogram oriented principal components (hopc) descriptor robust noise, viewpoint, scale action speed variations. 
3d point, hopc computed projecting three scaled eigenvectors pointcloud within local spatio-temporal support volume onto vertices regular dodecahedron. hopc also used detection spatio-temporal keypoints (stk) 3d pointcloud sequences view-invariant stk descriptors (or local hopc descriptors) key locations used action recognition. also propose global descriptor computed normalized spatio-temporal distribution stks 4-d, refer stk-d. evaluated performance proposed descriptors nine existing techniques two cross-view three single-view human action recognition datasets. experimental results show techniques provide significant improvement state-of-the-art methods.",4 "ruber: unsupervised method automatic evaluation open-domain dialog systems. open-domain human-computer conversation attracting increasing attention past years. however, exist standard automatic evaluation metric open-domain dialog systems; researchers usually resort human annotation model evaluation, time- labor-intensive. paper, propose ruber, referenced metric unreferenced metric blended evaluation routine, evaluates reply taking consideration groundtruth reply query (previous user-issued utterance). metric learnable, training require labels human satisfaction. hence, ruber flexible extensible different datasets languages. experiments retrieval generative dialog systems show ruber high correlation human annotation.",4 "solving factored mdps hybrid state action variables. efficient representations solutions large decision problems continuous discrete variables among important challenges faced designers automated decision support systems. paper, describe novel hybrid factored markov decision process (mdp) model allows compact representation problems, new hybrid approximate linear programming (halp) framework permits efficient solutions. central idea halp approximate optimal value function linear combination basis functions optimize weights linear programming. 
analyze theoretical computational aspects approach, demonstrate scale-up potential several hybrid optimization problems.",4 "global analysis expectation maximization mixtures two gaussians. expectation maximization (em) among popular algorithms estimating parameters statistical models. however, em, iterative algorithm based maximum likelihood principle, generally guaranteed find stationary points likelihood objective, points may far maximizer. article addresses disconnect statistical principles behind em algorithmic properties. specifically, provides global analysis em specific models observations comprise i.i.d. sample mixture two gaussians. achieved (i) studying sequence parameters idealized execution em infinite sample limit, fully characterizing limit points sequence terms initial parameters; (ii) based convergence analysis, establishing statistical consistency (or lack thereof) actual sequence parameters produced em.",12 "least angle $\ell_1$ penalized regression: review. least angle regression promising technique variable selection applications, offering nice alternative stepwise regression. provides explanation similar behavior lasso ($\ell_1$-penalized regression) forward stagewise regression, provides fast implementation both. idea caught rapidly, sparked great deal research interest. paper, give overview least angle regression current state related research.",19 "model selection topic models via spectral decomposition. topic models achieved significant successes analyzing large-scale text corpus. practical applications, always confronted challenge model selection, i.e., appropriately set number topics. following recent advances topic model inference via tensor decomposition, make first attempt provide theoretical analysis model selection latent dirichlet allocation. mild conditions, derive upper bound lower bound number topics given text collection finite size. experimental results demonstrate bounds accurate tight. 
furthermore, using gaussian mixture model example, show methodology easily generalized model selection analysis latent models.",19 "synkhronos: multi-gpu theano extension data parallelism. present synkhronos, extension theano multi-gpu computations leveraging data parallelism. framework provides automated execution synchronization across devices, allowing users continue write serial programs without risk race conditions. nvidia collective communication library used high-bandwidth inter-gpu communication. enhancements theano function interface include input slicing (with aggregation) input indexing, perform common data-parallel computation patterns efficiently. one example use case synchronous sgd, recently shown scale well growing set deep learning problems. training resnet-50, achieve near-linear speedup 7.5x nvidia dgx-1 using 8 gpus, relative theano-only code running single gpu isolation. yet synkhronos remains general data-parallel computation programmable theano. implementing parallelism level individual theano functions, framework uniquely addresses niche manual multi-device programming prescribed multi-gpu training routines.",4 "deepskeleton: learning multi-task scale-associated deep side outputs object skeleton extraction natural images. object skeletons useful object representation object detection. complementary object contour, provide extra information, object scale (thickness) varies among object parts. object skeleton extraction natural images challenging, requires extractor able capture local non-local image context order determine scale skeleton pixel. paper, present novel fully convolutional network multiple scale-associated side outputs address problem. observing relationship receptive field sizes different layers network skeleton scales capture, introduce two scale-associated side outputs stage network. 
network trained multi-task learning, one task skeleton localization classify whether pixel skeleton pixel not, skeleton scale prediction regress scale skeleton pixel. supervision imposed different stages guiding scale-associated side outputs toward groundtruth skeletons appropriate scales. responses multiple scale-associated side outputs fused scale-specific way detect skeleton pixels using multiple scales effectively. method achieves promising results two skeleton extraction datasets, significantly outperforms competitors. additionally, usefulness obtained skeletons scales (thickness) verified two object detection applications: foreground object segmentation object proposal detection.",4 "particle approximations score observed information matrix parameter estimation state space models linear computational cost. poyiadjis et al. (2011) show particle methods used estimate score observed information matrix state space models. methods either suffer computational cost quadratic number particles, produce estimates whose variance increases quadratically amount data. paper introduces alternative approach estimating terms computational cost linear number particles. method derived using combination kernel density estimation, avoid particle degeneracy causes quadratically increasing variance, rao-blackwellisation. crucially, show method robust choice bandwidth within kernel density estimation, good asymptotic properties regardless choice. estimates score observed information matrix used within online batch procedures estimating parameters state space models. empirical results show improved parameter estimates compared existing methods significantly reduced computational cost. supplementary materials including code available.",19 "postprocessing compressed images via sequential denoising. work propose novel postprocessing technique compression-artifact reduction. 
approach based posing task inverse problem, regularization leverages existing state-of-the-art image denoising algorithms. rely recently proposed plug-and-play prior framework, suggesting solution general inverse problems via alternating direction method multipliers (admm), leading sequence gaussian denoising steps. key feature scheme linearization compression-decompression process, get formulation optimized. addition, supply thorough analysis linear approximation several basic compression procedures. proposed method suitable diverse compression techniques rely transform coding. specifically, demonstrate impressive gains image quality several leading compression methods - jpeg, jpeg2000, hevc.",4 "simultaneous traffic sign detection boundary estimation using convolutional neural network. propose novel traffic sign detection system simultaneously estimates location precise boundary traffic signs using convolutional neural network (cnn). estimating precise boundary traffic signs important navigation systems intelligent vehicles traffic signs used 3d landmarks road environment. previous traffic sign detection systems, including recent methods based cnn, provide bounding boxes traffic signs output, thus requires additional processes contour estimation image segmentation obtain precise sign boundary. work, boundary estimation traffic signs formulated 2d pose shape class prediction problem, effectively solved single cnn. predicted 2d pose shape class target traffic sign input image, estimate actual boundary target sign projecting boundary corresponding template sign image input image plane. formulating boundary estimation problem cnn-based pose shape prediction task, method end-to-end trainable, robust occlusion small targets boundary estimation methods rely contour estimation image segmentation. 
proposed method architectural optimization provides accurate traffic sign boundary estimation also efficient compute, showing detection frame rate higher 7 frames per second low-power mobile platforms.",4 "novel feature extraction, selection fusion effective malware family classification. modern malware designed mutation characteristics, namely polymorphism metamorphism, causes enormous growth number variants malware samples. categorization malware samples basis behaviors essential computer security community, receive huge number malware everyday, signature extraction process usually based malicious parts characterizing malware families. microsoft released malware classification challenge 2015 huge dataset near 0.5 terabytes data, containing 20k malware samples. analysis dataset inspired development novel paradigm effective categorizing malware variants actual family groups. paradigm presented discussed present paper, emphasis given phases related extraction, selection set novel features effective representation malware samples. features grouped according different characteristics malware behavior, fusion performed according per-class weighting paradigm. proposed method achieved high accuracy ($\approx$ 0.998) microsoft malware challenge dataset.",4 "stochastic multi-armed bandits constant space. consider stochastic bandit problem sublinear space setting, one cannot record win-loss record $k$ arms. give algorithm using $o(1)$ words space regret \[ \sum_{i=1}^{k}\frac{1}{\delta_i}\log \frac{\delta_i}{\delta}\log \] $\delta_i$ gap best arm arm $i$ $\delta$ gap best second-best arms. rewards bounded away $0$ $1$, within $o(\log 1/\delta)$ factor optimum regret possible without space constraints.",4 "demystifying neural style transfer. neural style transfer recently demonstrated exciting results catches eyes academia industry. despite amazing results, principle neural style transfer, especially gram matrices could represent style remains unclear. 
paper, propose novel interpretation neural style transfer treating domain adaptation problem. specifically, theoretically show matching gram matrices feature maps equivalent minimize maximum mean discrepancy (mmd) second order polynomial kernel. thus, argue essence neural style transfer match feature distributions style images generated images. support standpoint, experiment several distribution alignment methods, achieve appealing results. believe novel interpretation connects two important research fields, could enlighten future researches.",4 "neural machine translation via binary code prediction. paper, propose new method calculating output layer neural machine translation systems. method based predicting binary code word reduce computation time/memory requirements output layer logarithmic vocabulary size best case. addition, also introduce two advanced approaches improve robustness proposed model: using error-correcting codes combining softmax binary codes. experiments two english-japanese bidirectional translation tasks show proposed models achieve bleu scores approach softmax, reducing memory usage order less 1/10 improving decoding speed cpus x5 x10.",4 "mining causal relationships: data-driven study islamic state. islamic state iraq al-sham (isis) dominant insurgent group operating iraq syria rose prominence took mosul june, 2014. paper, present data-driven approach analyzing group using dataset consisting 2200 incidents military activity surrounding isis forces oppose (including iraqi, syrian, american-led coalition). combine ideas logic programming causal reasoning mine association rules present evidence causality. present relationships link isis vehicle-borne improvised explosive device (vbied) activity syria military operations iraq, coalition air strikes, isis ied activity, well rules may serve indicators spikes indirect fire, suicide attacks, arrests.",4 "infinite-horizon policy-gradient estimation.
gradient-based approaches direct policy search reinforcement learning received much recent attention means solve problems partial observability avoid problems associated policy degradation value-function methods. paper introduce gpomdp, simulation-based algorithm generating biased estimate gradient average reward partially observable markov decision processes pomdps controlled parameterized stochastic policies. similar algorithm proposed (kimura et al. 1995). algorithm's chief advantages requires storage twice number policy parameters, uses one free beta (which natural interpretation terms bias-variance trade-off), requires no knowledge underlying state. prove convergence gpomdp, show correct choice parameter beta related mixing time controlled pomdp. briefly describe extensions gpomdp controlled markov chains, continuous state, observation control spaces, multiple-agents, higher-order derivatives, version training stochastic policies internal states. companion paper (baxter et al., volume) show gradient estimates generated gpomdp used traditional stochastic gradient algorithm conjugate-gradient procedure find local optima average reward.
learning method fast based simple linear algebraic operations, e.g. singular value decomposition tensor power iterations. provide guaranteed recovery community memberships model parameters present careful finite sample analysis learning method. important special case, results match best known scaling requirements (homogeneous) stochastic block model.",4 "machine learning phonologically conditioned noun declensions tamil morphological generators. paper presents machine learning solutions practical problem natural language generation (nlg), particularly word formation agglutinative languages like tamil, supervised manner. morphological generator important component natural language processing artificial intelligence. generates word forms given root affixes. morphophonemic changes like addition, deletion, alternation etc., occur two morphemes words joined together. sandhi rules explicitly specified rule based morphological analyzers generators. machine learning framework, rules learned automatically system training samples subsequently applied new inputs. paper proposed machine learning models learn morphophonemic rules noun declensions given training data. models trained learn sandhi rules using various learning algorithms performance algorithms presented. conclude machine learning morphological processing word form generation successfully learned supervised manner, without explicit description rules. performance decision trees bayesian machine learning algorithms noun declensions discussed.",4 "probabilistic label relation graphs ising models. consider classification problems label space structure. common example hierarchical label spaces, corresponding case one label subsumes another (e.g., animal subsumes dog). labels also mutually exclusive (e.g., dog vs cat) unrelated (e.g., furry, carnivore). jointly model hierarchy exclusion relations, notion hex (hierarchy exclusion) graph introduced [7]. 
combined conditional random field (crf) deep neural network (dnn), resulting state art results applied visual object classification problems training labels drawn different levels imagenet hierarchy (e.g., image might labeled basic level category ""dog"", rather specific label ""husky""). paper, extend hex model allow soft probabilistic relations labels, useful uncertainty relationship two labels (e.g., antelope ""sort of"" furry, degree grizzly bear). call new model phex, probabilistic hex. show phex graph converted ising model, allows us use existing off-the-shelf inference methods (in contrast hex method, needed specialized inference algorithms). experimental results show significant improvements number large-scale visual object classification tasks, outperforming previous hex model.",4 "behavioral learning aircraft landing sequencing using society probabilistic finite state machines. air traffic control (atc) complex safety critical environment. tower controller would making many decisions real-time sequence aircraft. optimization tools exist help controller airports, even situations, real sequence aircraft adopted controller significantly different one proposed optimization algorithm. due dynamic nature environment. objective paper test hypothesis one learn sequence adopted controller strategies act heuristics decision support tools aircraft sequencing. aim tested paper attempting learn sequences generated well-known sequencing method used real world. approach relies genetic algorithm (ga) learn sequences using society probabilistic finite-state machines (pfsms). pfsm learns different sub-space; thus, decomposing learning problem group agents need work together learn overall problem. three sequence metrics (levenshtein, hamming position distances) compared fitness functions ga. results suggest, possible learn behavior algorithm/heuristic generated original sequence limited information.",4 "high-dimensional dynamics generalization error neural networks. 
perform average case analysis generalization dynamics large neural networks trained using gradient descent. study practically-relevant ""high-dimensional"" regime number free parameters network order even larger number examples dataset. using random matrix theory exact solutions linear models, derive generalization error training error dynamics learning analyze depend dimensionality data signal noise ratio learning problem. find dynamics gradient descent learning naturally protect overtraining overfitting large networks. overtraining worst intermediate network sizes, effective number free parameters equals number samples, thus reduced making network smaller larger. additionally, high-dimensional regime, low generalization error requires starting small initial weights. turn non-linear neural networks, show making networks large harm generalization performance. contrary, fact reduce overtraining, even without early stopping regularization sort. identify two novel phenomena underlying behavior overcomplete models: first, frozen subspace weights learning occurs gradient descent; second, statistical properties high-dimensional regime yield better-conditioned input correlations protect overtraining. demonstrate naive application worst-case theories rademacher complexity inaccurate predicting generalization performance deep neural networks, derive alternative bound incorporates frozen subspace conditioning effects qualitatively matches behavior observed simulation.",19 information extraction broadcast news. paper discusses development trainable statistical models extracting content television radio news broadcasts. particular concentrate statistical finite state models identifying proper names named entities broadcast speech. two models presented: first represents name class information word attribute; second represents word-word class-class transitions explicitly. common n-gram based formulation used models. 
task named entity identification characterized relatively sparse training data issues related smoothing discussed. experiments reported using darpa/nist hub-4e evaluation north american broadcast news.,4 "horizontally scalable submodular maximization. variety large-scale machine learning problems cast instances constrained submodular maximization. existing approaches distributed submodular maximization critical drawback: capacity - number instances fit memory - must grow data set size. practice, one provision many machines, capacity machine limited physical constraints. propose truly scalable approach distributed submodular maximization fixed capacity. proposed framework applies broad class algorithms constraints provides theoretical guarantees approximation factor available capacity. empirically evaluate proposed algorithm variety data sets demonstrate achieves performance competitive centralized greedy solution.",19 "applying ensemble learning method improving multi-label classification performance. recent years, multi-label classification problem become controversial issue. kind classification, sample associated set class labels. ensemble approaches supervised learning algorithms operator takes number learning algorithms, namely base-level algorithms combines outcomes make estimation. simplest form ensemble learning train base-level algorithms random subsets data let vote popular classifications average predictions base-level algorithms. study, ensemble learning method proposed improving multi-label classification evaluation criteria. compared method well-known base-level algorithms data sets. experiment results show proposed approach outperforms base well-known classifiers multi-label classification problem.",4 "relative succinctness sentential decision diagrams. sentential decision diagrams (sdds) introduced darwiche 2011 promising representation type used knowledge compilation. relative succinctness representation types important subject area. 
aim paper identify kind boolean functions represented sdds small size respect number variables functions defined on. reason sets boolean functions representable different representation types polynomial size investigated sdds compared representation types classical knowledge compilation map darwiche marquis. ordered binary decision diagrams (obdds) popular data structure boolean functions one representation types. sdds general obdds definition recently, boolean function presented polynomial sdd size exponential obdd size. result strengthened several ways. main result quasipolynomial simulation sdds equivalent unambiguous nondeterministic obdds, nondeterministic variant exists exactly one accepting computation satisfying input. side effect open problem relative succinctness sdds free binary decision diagrams (fbdds) general obdds answered.",4 "approximate reflection symmetry point set: theory algorithm application. propose algorithm detect approximate reflection symmetry present set volumetrically distributed points belonging $\mathbb{r}^d$ containing distorted reflection symmetry pattern. pose problem detecting approximate reflection symmetry problem establishing correspondences points reflections determining reflection symmetry transformation. formulate optimization framework problem establishing correspondences amounts solving linear assignment problem problem determining reflection symmetry transformation amounts optimization problem smooth riemannian product manifold. proposed approach estimates symmetry distribution points descriptor independent. evaluate robustness approach varying amount distortion perfect reflection symmetry pattern perturb point different amount perturbation. demonstrate effectiveness method applying problem 2-d reflection symmetry detection along relevant comparisons.",4 "dependency parsing dilated iterated graph cnns. 
dependency parses effective way inject linguistic knowledge many downstream tasks, many practitioners wish efficiently parse sentences scale. recent advances gpu hardware enabled neural networks achieve significant gains previous best models, models still fail leverage gpus' capability massive parallelism due requirement sequential processing sentence. response, propose dilated iterated graph convolutional neural networks (dig-cnns) graph-based dependency parsing, graph convolutional architecture allows efficient end-to-end gpu parsing. experiments english penn treebank benchmark, show dig-cnns perform par best neural network parsers.",4 "densenet: implementing efficient convnet descriptor pyramids. convolutional neural networks (cnns) provide accurate object classification. extended perform object detection iterating dense selected proposed object regions. however, runtime detectors scales total number and/or area regions examine per image, training detectors may prohibitively slow. however, cnn classifier topologies, possible share significant work among overlapping regions classified. paper presents densenet, open source system computes dense, multiscale features convolutional layers cnn based object classifier. future work involve training efficient object detectors densenet feature descriptors.",4 "glimpse far future: understanding long-term crowd worker quality. microtask crowdsourcing increasingly critical creation extremely large datasets. result, crowd workers spend weeks months repeating exact tasks, making necessary understand behavior long periods time. utilize three large, longitudinal datasets nine million annotations collected amazon mechanical turk examine claims workers fatigue satisfice long periods, producing lower quality work. find that, contrary claims, workers extremely stable quality entire period. understand whether workers set quality based task's requirements acceptance, perform experiment vary required quality large crowdsourcing task. 
workers adjust quality based acceptance threshold: workers threshold continued working usual quality level, workers threshold self-selected task. capitalizing consistency, demonstrate possible predict workers' long-term quality using glimpse quality first five tasks.",4 "clicks there!: anonymizing photographer camera saturated society. recent years, social media played increasingly important role reporting world events. publication crowd-sourced photographs videos near real-time one reasons behind high impact. however, use camera draw photographer situation conflict. examples include use cameras regulators collecting evidence mafia operations; citizens collecting evidence corruption public service outlet; political dissidents protesting public rallies. cases, published images contain fairly unambiguous clues location photographer (scene viewpoint information). presence adversary operated cameras, easy identify photographer also combining leaked information photographs themselves. call camera location detection attack. propose review defense techniques attacks. defenses image obfuscation techniques protect camera-location information; current anonymous publication technologies help either. however, use view synthesis algorithms could promising step direction providing probabilistic privacy guarantees.",4 "diffusercam: lensless single-exposure 3d imaging. demonstrate compact easy-to-build computational camera single-shot 3d imaging. lensless system consists solely diffuser placed front standard image sensor. every point within volumetric field-of-view projects unique pseudorandom pattern caustics sensor. using physical approximation simple calibration scheme, solve large-scale inverse problem computationally efficient way. caustic patterns enable compressed sensing, exploits sparsity sample solve 3d voxels pixels 2d sensor. 
3d voxel grid chosen match experimentally measured two-point optical resolution across field-of-view, resulting 100 million voxels reconstructed single 1.3 megapixel image. however, effective resolution varies significantly scene content. effect common wide range computational cameras, provide new theory analyzing resolution systems.",4 "integration lidar hyperspectral data land-cover classification: case study. paper, approach proposed fuse lidar hyperspectral data, considers spectral spatial information single framework. here, extended self-dual attribute profile (esdap) investigated extract spatial information hyperspectral data set. extract spectral information, well-known classifiers used support vector machines (svms), random forests (rfs), artificial neural networks (anns). proposed method accurately classify relatively volumetric data set cpu processing time real ill-posed situation balance number training samples number features. classification part proposed approach fully-automatic.",4 "normalization based k means clustering algorithm. k-means effective clustering technique used separate similar data groups based initial centroids clusters. paper, normalization based k-means clustering algorithm(n-k means) proposed. proposed n-k means clustering algorithm applies normalization prior clustering available data well proposed approach calculates initial centroids based weights. experimental results prove betterment proposed n-k means clustering algorithm existing k-means clustering algorithm terms complexity overall performance.",4 "certifying existence epipolar matrices. given set point correspondences two images, existence fundamental matrix necessary condition points images 3-dimensional scene imaged two pinhole cameras. camera calibration known one requires existence essential matrix. present efficient algorithm, using exact linear algebra, testing existence fundamental matrix. input number point correspondences. 
essential matrices, characterize solvability demazure polynomials. scenarios, determine linear subspaces intersect fixed set defined non-linear polynomials. conditions derive polynomials stated purely terms image coordinates. represent new class two-view invariants, free fundamental (resp.~essential)~matrices.",4 "automated auto-encoder correlation-based health-monitoring prognostic method machine bearings. paper studies intelligent ultimate technique health-monitoring prognostic common rotary machine components, particularly bearings. run-to-failure experiment, rich unsupervised features vibration sensory data extracted trained sparse auto-encoder. then, correlation extracted attributes initial samples (presumably healthy beginning test) succeeding samples calculated passed moving-average filter. normalized output named auto-encoder correlation-based (aec) rate stands informative attribute system depicting health status precisely identifying degradation starting point. show aec technique well-generalizes several run-to-failure tests. aec collects rich unsupervised features form vibration data fully autonomous. demonstrate superiority aec many state-of-the-art approaches health monitoring prognostic machine bearings.",4 "retinal vessel segmentation fundoscopic images generative adversarial networks. retinal vessel segmentation indispensable step automatic detection retinal diseases fundoscopic images. though many approaches proposed, existing methods tend miss fine vessels allow false positives terminal branches. let alone under-segmentation, over-segmentation also problematic quantitative studies need measure precise width vessels. paper, present method generates precise map retinal vessels using generative adversarial training. methods achieve dice coefficient 0.829 drive dataset 0.834 stare dataset state-of-the-art performance datasets.",4 "inference networks evaluation evidence: alternative analyses. 
inference networks variety important uses constructed persons quite different standpoints. discussed paper three different complementary methods generating analyzing probabilistic inference networks. first method, though eighty years old, useful knowledge representation task constructing probabilistic arguments. also useful heuristic device generating new forms evidence. two methods formally equivalent ways combining probabilities analysis inference networks. use three methods illustrated analysis mass evidence celebrated american law case.",4 "neural network based nonlinear weighted finite automata. weighted finite automata (wfa) expressively model functions defined strings inherently linear models. given recent successes nonlinear models machine learning, natural wonder whether extending wfa nonlinear setting would beneficial. paper, propose novel model neural network based nonlinear wfa model (nl-wfa) along learning algorithm. learning algorithm inspired spectral learning algorithm wfa and relies nonlinear decomposition so-called hankel matrix, means auto-encoder network. expressive power nl-wfa proposed learning algorithm assessed synthetic real-world data, showing nl-wfa lead smaller model sizes infer complex grammatical structures data.",4 "face synthesis (fasy) system determining characteristics face image. paper aims determining characteristics face image extracting components. fasy (face synthesis) system face database retrieval new face generation system development. one main features generation requested face found existing database, allows continuous growing database also. generate new face image, need store face components database. designed new technique extract face components sophisticated method. extraction facial feature points analyzed components determine characteristics. extraction analysis stored components along characteristics face database later use face construction.",4 "real-time human pose estimation video convolutional neural networks. 
paper, present method real-time multi-person human pose estimation video utilizing convolutional neural networks. method aimed use case specific applications, good accuracy essential variation background poses limited. enables us use generic network architecture, accurate fast. divide problem two phases: (1) pre-training (2) finetuning. pre-training, network learned highly diverse input data publicly available datasets, finetuning train application specific data, record kinect. method differs state-of-the-art methods consider whole system, including person detector, pose estimator automatic way record application specific training material finetuning. method considerably faster many state-of-the-art methods. method thought replacement kinect, used higher level tasks, gesture control, games, person tracking, action recognition action tracking. achieved accuracy 96.8% (pck@0.2) application specific data.",4 "learned multi-patch similarity. estimating depth map multiple views scene fundamental task computer vision. soon two viewpoints available, one faces basic question measure similarity across >2 image patches. surprisingly, direct solution exists, instead common fall back less robust averaging two-view similarities. encouraged success machine learning, particular convolutional neural networks, propose learn matching function directly maps multiple image patches scalar similarity score. experiments several multi-view datasets demonstrate approach advantages methods based pairwise patch similarity.",4 "bet independence. study problem nonparametric dependence detection. many existing methods suffer severe power loss due non-uniform consistency, illustrate paradox. avoid power loss, approach nonparametric test independence new framework binary expansion statistics (bestat) binary expansion testing (bet), examine dependence novel binary expansion filtration approximation copula. 
hadamard-walsh transform, find cross interactions binary variables filtration complete sufficient statistics dependence. interactions also uncorrelated null. utilizing interactions, bet avoids problem non-uniform consistency improves upon wide class commonly used methods (a) achieving minimax rate sample size requirement specified power (b) providing clear interpretations global local relationships upon rejection independence. binary expansion approach also connects test statistics current computing system facilitate efficient bitwise implementation. illustrate bet study distribution stars night sky exploratory data analysis tcga breast cancer data.",12 "assessment algorithms mitosis detection breast cancer histopathology images. proliferative activity breast tumors, routinely estimated counting mitotic figures hematoxylin eosin stained histology sections, considered one important prognostic markers. however, mitosis counting laborious, subjective may suffer low inter-observer agreement. wider acceptance whole slide images pathology labs, automatic image analysis proposed potential solution issues. paper, results assessment mitosis detection algorithms 2013 (amida13) challenge described. challenge based data set consisting 12 training 11 testing subjects, one thousand annotated mitotic figures multiple observers. short descriptions results evaluation eleven methods presented. top performing method error rate comparable inter-observer agreement among pathologists.",4 "hierarchized block wise image approximation greedy pursuit strategies. approach effective implementation greedy selection methodologies, approximate image partitioned blocks, proposed. method specially designed approximating partitions transformed image. evolves selecting, iteration step, i) elements approximating blocks partitioning image ii) hierarchized sequence blocks approximated reach required global condition sparsity.",4 "abstract syntax networks code generation semantic parsing. 
tasks like code generation semantic parsing require mapping unstructured (or partially structured) inputs well-formed, executable outputs. introduce abstract syntax networks, modeling framework problems. outputs represented abstract syntax trees (asts) constructed decoder dynamically-determined modular structure paralleling structure output tree. benchmark hearthstone dataset code generation, model obtains 79.2 bleu 22.7% exact match accuracy, compared previous state-of-the-art values 67.1 6.1%. furthermore, perform competitively atis, jobs, geo semantic parsing datasets task-specific engineering.",4 "robust image registration via empirical mode decomposition. spatially varying intensity noise common source distortion images. bias field noise one example distortion often present magnetic resonance (mr) images. paper, first show empirical mode decomposition (emd) considerably reduce bias field noise mr images. then, propose two hierarchical multi-resolution emd-based algorithms robust registration images presence spatially varying noise. one algorithm (lr-emd) based registering emd feature-maps floating reference images various resolution levels. second algorithm (afr-emd), first extract average feature-map based emd floating reference images. then, use simple hierarchical multi-resolution algorithm based downsampling register average feature-maps. algorithms achieve lower error rate higher convergence percentage compared intensity-based hierarchical registration. specifically, using mutual information similarity measure, afr-emd achieves 42% lower error rate intensity 52% lower error rate transformation compared intensity-based hierarchical registration. lr-emd, error rate 32% lower intensity 41% lower transformation.",4 "asynchronous distributed variational gaussian processes regression. gaussian processes (gps) powerful non-parametric function estimators. however, applications largely limited expensive computational cost inference procedures. 
existing stochastic distributed synchronous variational inferences, although alleviated issue scaling gps millions samples, still far satisfactory real-world large applications, data sizes often orders magnitudes larger, say, billions. solve problem, propose advgp, first asynchronous distributed variational gaussian process inference regression, recent large-scale machine learning platform, parameterserver. advgp uses novel, flexible variational framework based weight space augmentation, implements highly efficient, asynchronous proximal gradient optimization. maintaining comparable better predictive performance, advgp greatly improves upon efficiency existing variational methods. advgp, effortlessly scale gp regression real-world application billions samples demonstrate excellent, superior prediction accuracy popular linear models.",19 "denoising adversarial autoencoders: classifying skin lesions using limited labelled training data. propose novel deep learning model classifying medical images setting large amount unlabelled medical data available, labelled data limited supply. consider specific case classifying skin lesions either malignant benign. setting, proposed approach -- semi-supervised, denoising adversarial autoencoder -- able utilise vast amounts unlabelled data learn representation skin lesions, small amounts labelled data assign class labels based learned representation. analyse contributions adversarial denoising components model find combination yields superior classification performance setting limited labelled training data.",4 "tagging multimedia stimuli ontologies. successful management emotional stimuli pivotal issue concerning affective computing (ac) related research. subfield artificial intelligence, ac concerned design computer systems accompanying hardware recognize, interpret, process human emotions, also development systems trigger human emotional response ordered controlled manner. 
requires maximum attainable precision efficiency extraction data emotionally annotated databases databases use keywords tags description semantic content, provide either necessary flexibility leverage needed efficiently extract pertinent emotional content. therefore, extent propose introduction ontologies new paradigm description emotionally annotated data. ability select sequence data based semantic attributes vital study involving metadata, semantics ontological sorting like semantic web social semantic desktop, approach described paper facilitates reuse areas well.",4 "nonparanormal: semiparametric estimation high dimensional undirected graphs. recent methods estimating sparse undirected graphs real-valued data high dimensional problems rely heavily assumption normality. show use semiparametric gaussian copula--or ""nonparanormal""--for high dimensional inference. additive models extend linear models replacing linear functions set one-dimensional smooth functions, nonparanormal extends normal transforming variables smooth functions. derive method estimating nonparanormal, study method's theoretical properties, show works well many examples.",19 "evaluating semantic models word-sentence relatedness. semantic textual similarity (sts) systems designed encode evaluate semantic similarity words, phrases, sentences, documents. one method assessing quality authenticity semantic information encoded systems comparison human judgments. data set evaluating semantic models developed consisting 775 english word-sentence pairs, annotated semantic relatedness human raters engaged maximum difference scaling (mds) task, well faster alternative task. sample application relatedness data, behavior-based relatedness compared relatedness computed via four off-the-shelf sts models: n-gram, latent semantic analysis (lsa), word2vec, umbc ebiquity. sts models captured much variance human judgments collected, sensitive implicatures entailments processed considered participants. 
text stimuli judgment data made freely available.",4 "best practice cnns applied visual instance retrieval?. previous work shown feature maps deep convolutional neural networks (cnns) interpreted feature representation particular image region. features aggregated feature maps exploited image retrieval tasks achieved state-of-the-art performances recent years. key success methods feature representation. however, different factors impact effectiveness features still explored thoroughly. much less discussion best combination them. main contribution paper thorough evaluations various factors affect discriminative ability features extracted cnns. based evaluation results, also identify best choices different factors propose new multi-scale image feature representation method encode image effectively. finally, show proposed method generalises well outperforms state-of-the-art methods four typical datasets used visual instance retrieval.",4 "computational cost reduction learned transform classifications. present theoretical analysis empirical evaluations novel set techniques computational cost reduction classifiers based learned transform soft-threshold. modifying optimization procedures dictionary classifier training, well resulting dictionary entries, techniques allow reduce bit precision replace floating-point multiplication single integer bit shift. also show optimization algorithms dictionary training methods modified penalize higher-energy dictionaries. applied techniques classifier learning algorithm soft-thresholding, testing datasets used original paper. results indicate feasible use solely sums bit shifts integers classify test time limited reduction classification accuracy. low power operations valuable trade fpga implementations increase classification throughput decrease energy consumption manufacturing cost.",4 "multi-scale deep learning architectures person re-identification. 
person re-identification (re-id) aims match people across non-overlapping camera views public space. challenging problem many people captured surveillance videos wear similar clothes. consequently, differences appearance often subtle detectable right location scales. existing re-id models, particularly recently proposed deep learning based ones match people single scale. contrast, paper, novel multi-scale deep learning model proposed. model able learn deep discriminative feature representations different scales automatically determine suitable scales matching. importance different spatial locations extracting discriminative features also learned explicitly. experiments carried demonstrate proposed model outperforms state-of-the-art number benchmarks.",4 "spatially encoding temporal correlations classify temporal data using convolutional neural networks. propose off-line approach explicitly encode temporal patterns spatially different types images, namely, gramian angular fields markov transition fields. enables use techniques computer vision feature learning classification. used tiled convolutional neural networks learn high-level features individual gaf, mtf, gaf-mtf images 12 benchmark time series datasets two real spatial-temporal trajectory datasets. classification results approach competitive state-of-the-art approaches types data. analysis features weights learned cnns explains approach works.",4 "benefit combining neural, statistical external features fake news identification. identifying veracity news article interesting problem automating process challenging task. detection news article fake still open question contingent many factors current state-of-the-art models fail incorporate. paper, explore subtask fake news identification, stance detection. given news article, task determine relevance body claim. present novel idea combines neural, statistical external features provide efficient solution problem. 
compute neural embedding deep recurrent model, statistical features weighted n-gram bag-of-words model handcrafted external features help feature engineering heuristics. finally, using deep neural layer features combined, thereby classifying headline-body news pair agree, disagree, discuss, unrelated. compare proposed technique current state-of-the-art models fake news challenge dataset. extensive experiments, find proposed model outperforms state-of-the-art techniques including submissions fake news challenge.",4 "corpus based enrichment germanet verb frames. lexical semantic resources, like wordnet, often used real applications natural language document processing. example, integrated germanet document suite xdoc processing german forensic autopsy protocols. addition hypernymy synonymy relation, want adapt germanet's verb frames analysis. paper outline approach domain related enrichment germanet verb frames corpus based syntactic co-occurred data analyses real documents.",4 "scalable out-of-sample extension graph embeddings using deep neural networks. several popular graph embedding techniques representation learning dimensionality reduction rely performing computationally expensive eigendecompositions derive nonlinear transformation input data space. resulting eigenvectors encode embedding coordinates training samples only, embedding novel data samples requires costly computation. paper, present method out-of-sample extension graph embeddings using deep neural networks (dnn) parametrically approximate nonlinear maps. compared traditional nonparametric out-of-sample extension methods, demonstrate dnns generalize equal better fidelity require orders magnitude less computation test time. moreover, find unsupervised pretraining dnns improves optimization larger network sizes, thus removing sensitivity model selection.",19 "multimodal named entity recognition short social media posts. 
introduce new task called multimodal named entity recognition (mner) noisy user-generated data tweets snapchat captions, comprise short text accompanying images. social media posts often come inconsistent incomplete syntax lexical notations limited surrounding textual contexts, bringing significant challenges ner. end, create new dataset mner called snapcaptions (snapchat image-caption pairs submitted public crowd-sourced stories fully annotated named entities). build upon state-of-the-art bi-lstm word/character based ner models 1) deep image network incorporates relevant visual context augment textual information, 2) generic modality-attention module learns attenuate irrelevant modalities amplifying informative ones extract contexts from, adaptive sample token. proposed mner model modality attention significantly outperforms state-of-the-art text-only ner models successfully leveraging provided visual contexts, opening potential applications mner myriads social media platforms.",4 "stochastic gradient estimate variance contrastive divergence persistent contrastive divergence. contrastive divergence (cd) persistent contrastive divergence (pcd) popular methods training weights restricted boltzmann machines. however, methods use approximate method sampling model distribution. side effect, approximations yield significantly different biases variances stochastic gradient estimates individual data points. well known cd yields biased gradient estimate. paper however show empirically cd lower stochastic gradient estimate variance exact sampling, mean subsequent pcd estimates higher variance exact sampling. results give one explanation finding cd used smaller minibatches higher learning rates pcd.",4 "comparison echo state network output layer classification methods noisy data. echo state networks recently developed type recurrent neural network internal layer fixed random weights, output layer trained specific data. 
echo state networks increasingly used process spatiotemporal data real-world settings, including speech recognition, event detection, robot control. strength echo state networks simple method used train output layer - typically collection linear readout weights found using least squares approach. although straightforward train low computational cost use, method may yield acceptable accuracy performance noisy data. study compares performance three echo state network output layer methods perform classification noisy data: using trained linear weights, using sparse trained linear weights, using trained low-rank approximations reservoir states. methods investigated experimentally synthetic natural datasets. experiments suggest using regularized least squares train linear output weights superior data low noise, using low-rank approximations may significantly improve accuracy datasets contaminated higher noise levels.",4 "virtual sensor modelling using neural networks coefficient-based adaptive weights biases search algorithm diesel engines. explosion field big data introduction stringent emission norms every three five years, automotive companies must continue enhance fuel economy ratings products, also provide valued services customers delivering engine performance health reports regular intervals. reasonable solution issues installing variety sensors engine. sensor data used develop fuel economy features directly indicate engine performance. however, mounting plethora sensors impractical cost-sensitive industry. thus, virtual sensors replace physical sensors reducing cost capturing essential engine data.",4 "gradient descent learns one-hidden-layer cnn: afraid spurious local minima. 
consider problem learning one-hidden-layer neural network non-overlapping convolutional layer relu activation function, i.e., $f(\mathbf{z}; \mathbf{w}, \mathbf{a}) = \sum_j a_j\sigma(\mathbf{w}^\top\mathbf{z}_j)$, convolutional weights $\mathbf{w}$ output weights $\mathbf{a}$ parameters learned. prove gaussian input $\mathbf{z}$, spurious local minimum global minimum. surprisingly, presence local minimum, starting randomly initialized weights, gradient descent weight normalization still proven recover true parameters constant probability (which boosted arbitrarily high accuracy multiple restarts). also show constant probability, procedure could also converge spurious local minimum, showing local minimum plays non-trivial role dynamics gradient descent. furthermore, quantitative analysis shows gradient descent dynamics two phases: starts slow, converges much faster several iterations.",4 "msr-net: low-light image enhancement using deep convolutional network. images captured low-light conditions usually suffer low contrast, increases difficulty subsequent computer vision tasks great extent. paper, low-light image enhancement model based convolutional neural network retinex theory proposed. firstly, show multi-scale retinex equivalent feedforward convolutional neural network different gaussian convolution kernels. motivated fact, consider convolutional neural network (msr-net) directly learns end-to-end mapping dark bright images. different fundamentally existing approaches, low-light image enhancement paper regarded machine learning problem. model, parameters optimized back-propagation, parameters traditional models depend artificial setting. experiments number challenging images reveal advantages method comparison state-of-the-art methods qualitative quantitative perspective.",4 "scalable latent tree model application health analytics. present integrated approach structure parameter estimation latent tree graphical models, nodes hidden. 
overall approach follows ""divide-and-conquer"" strategy learns models small groups variables iteratively merges global solution. structure learning involves combinatorial operations minimum spanning tree construction local recursive grouping; parameter learning based method moments tensor decompositions. method guaranteed correctly recover unknown tree structure model parameters low sample complexity class linear multivariate latent tree models includes discrete gaussian distributions, gaussian mixtures. bulk asynchronous parallel algorithm implemented parallel using openmp framework scales logarithmically number variables linearly dimensionality variable. experiments confirm high degree efficiency accuracy large datasets electronic health records. proposed algorithm also generates intuitive clinically meaningful disease hierarchies.",4 "travel time estimation using floating car data. report explores use machine learning techniques accurately predict travel times city streets highways using floating car data (location information user vehicles road network). aim report twofold, first present general architecture solving problem, present evaluate techniques real floating car data gathered month 5 km highway new delhi.",4 "quantifying mesoscale neuroanatomy using x-ray microtomography. methods resolving 3d microstructure brain typically start thinly slicing staining brain, imaging individual section visible light photons electrons. contrast, x-rays used image thick samples, providing rapid approach producing large 3d brain maps without sectioning. demonstrate use synchrotron x-ray microtomography ($\mu$ct) producing mesoscale $(1~\mu m^3)$ resolution brain maps millimeter-scale volumes mouse brain. introduce pipeline $\mu$ct-based brain mapping combines methods sample preparation, imaging, automated segmentation image volumes cells blood vessels, statistical analysis resulting brain structures. 
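The structure-learning step in the latent tree abstract above starts from minimum spanning tree construction over pairwise statistics. A minimal Prim's-algorithm sketch; the distance matrix below is an invented example, not data from the paper:

```python
# Prim's algorithm over a dense distance matrix: repeatedly attach the
# closest out-of-tree node to the growing tree.
def prim_mst(D):
    """Return MST edges (parent, child) of a dense distance matrix."""
    n, in_tree, edges = len(D), {0}, []
    while len(in_tree) < n:
        _, i, j = min((D[i][j], i, j) for i in in_tree
                      for j in range(n) if j not in in_tree)
        edges.append((i, j))
        in_tree.add(j)
    return edges

D = [[0, 2, 9, 4],
     [2, 0, 6, 3],
     [9, 6, 0, 1],
     [4, 3, 1, 0]]
print(prim_mst(D))  # [(0, 1), (1, 3), (3, 2)]
```

In the paper's pipeline this tree would then be refined by local recursive grouping; only the MST step is sketched here.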
results demonstrate x-ray tomography promises rapid quantification large brain volumes, complementing brain mapping connectomics efforts.",16 "obda constraints effective query answering (extended version). ontology based data access (obda) users pose sparql queries ontology lies top relational datasources. queries translated on-the-fly sql queries obda systems. standard sparql-to-sql translation techniques obda often produce sql queries containing redundant joins unions, even number semantic structural optimizations. redundancies detrimental performance query answering, especially complex industrial obda scenarios large enterprise databases. address issue, introduce two novel notions obda constraints show exploit efficient query answering. conduct extensive set experiments large datasets using real world data queries, showing techniques strongly improve performance query answering orders magnitude.",4 "high-order attention models visual question answering. quest algorithms enable cognitive abilities important part machine learning. common trait many recently investigated cognitive-like tasks take account different data modalities, visual textual input. paper propose novel generally applicable form attention mechanism learns high-order correlations various data modalities. show high-order correlations effectively direct appropriate attention relevant elements different data modalities required solve joint task. demonstrate effectiveness high-order attention mechanism task visual question answering (vqa), achieve state-of-the-art performance standard vqa dataset.",4 "attention-based information fusion using multi-encoder-decoder recurrent neural networks. rising number interconnected devices sensors, modeling distributed sensor networks increasing interest. recurrent neural networks (rnn) considered particularly well suited modeling sensory streaming data. predicting future behavior, incorporating information neighboring sensor stations often beneficial. 
propose new rnn based architecture context specific information fusion across multiple spatially distributed sensor stations. hereby, latent representations multiple local models, modeling one sensor station, jointly weighted, according importance prediction. particular importance assessed depending current context using separate attention function. demonstrate effectiveness model three different real-world sensor network datasets.",4 "moment based estimation stochastic kronecker graph parameters. stochastic kronecker graphs supply parsimonious model large sparse real world graphs. specify distribution large random graph using three four parameters. parameters however proved difficult choose specific applications. article looks method moments estimators computationally much simpler maximum likelihood. estimators fast examples, typically yield kronecker parameters expected feature counts closer given graph get kronfit. improvement especially prominent number triangles graph.",19 "accurate facial parts localization deep learning 3d facial expression recognition. meaningful facial parts convey key cues facial action unit detection expression prediction. textured 3d face scan provide detailed 3d geometric shape 2d texture appearance cues face beneficial facial expression recognition (fer). however, accurate facial parts extraction well fusion challenging tasks. paper, novel system 3d fer designed based accurate facial parts extraction deep feature fusion facial parts. particular, textured 3d face scan firstly represented 2d texture map depth map one-to-one dense correspondence. then, facial parts texture map depth map extracted using novel 4-stage process consists facial landmark localization, facial rotation correction, facial resizing, facial parts bounding box extraction post-processing procedures. finally, deep fusion convolutional neural networks (cnns) features facial parts learned texture maps depth maps, respectively nonlinear svms used expression prediction.
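The attention-based fusion described in the multi-encoder-decoder abstract above weights station-local latent representations by a context-dependent attention function. A hedged sketch; the shapes and the linear attention form are illustrative assumptions, not the paper's architecture:

```python
import numpy as np

# Context-dependent fusion: latent vectors from several station-local
# models are combined with weights from a separate attention function.
def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def fuse(latents, context, W_att):
    """latents: (n_stations, d); context: (c,); W_att: (n_stations, c)."""
    weights = softmax(W_att @ context)   # importance of each station
    return weights @ latents, weights    # weighted joint representation

rng = np.random.default_rng(0)
latents = rng.standard_normal((3, 4))    # 3 stations, 4-dim latents
fused, w = fuse(latents, rng.standard_normal(2), rng.standard_normal((3, 2)))
```

The softmax guarantees the station weights form a convex combination, so the fused vector stays in the span of the local representations.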
experiments conducted bu-3dfe database, demonstrating effectiveness combining different facial parts, texture depth cues reporting state-of-the-art results comparison existing methods setting.",4 "efficient construction local parametric reduced order models using machine learning techniques. reduced order models computationally inexpensive approximations capture important dynamical characteristics large, high-fidelity computer models physical systems. paper applies machine learning techniques improve design parametric reduced order models. specifically, machine learning used develop feasible regions parameter space admissible target accuracy achieved predefined reduced order basis, construct parametric maps, choose best two already existing bases new parameter configuration accuracy point view pre-select optimal dimension reduced basis meet desired accuracy. combining available information using bases concatenation interpolation well high-fidelity solutions interpolation able build accurate reduced order models associated new parameter settings. promising numerical results viscous burgers model illustrate potential machine learning approaches help design better reduced order models.",4 "generating news headlines recurrent neural networks. describe application encoder-decoder recurrent neural network lstm units attention generating headlines text news articles. find model quite effective concisely paraphrasing news articles. furthermore, study neural network decides input words pay attention to, specifically identify function different neurons simplified attention mechanism. interestingly, simplified attention mechanism performs better complex attention mechanism held set articles.",4 "network statistics early english syntax: structural criteria. paper includes reflection role networks study english language acquisition, well collection practical criteria annotate free-speech corpora children utterances.
theoretical level, main claim paper syntactic networks interpreted outcome use syntactic machinery. thus, intrinsic features machinery accessible directly (known) network properties. rather, one see global patterns use and, thus, global view power organization underlying grammar. taking look practical issues, paper examines build net projection syntactic relations. recall that, opposed adult grammars, early-child language well-defined concept structure. overcome difficulty, develop set systematic criteria assuming constituency hierarchy grammar based lexico-thematic relations. end, obtain well defined corpora annotation enables us i) perform statistics size structures ii) build network syntactic relations perform standard measures complexity. also provide detailed example.",4 "ff planning system: fast plan generation heuristic search. describe evaluate algorithmic techniques used ff planning system. like hsp system, ff relies forward state space search, using heuristic estimates goal distances ignoring delete lists. unlike hsp's heuristic, method assume facts independent. introduce novel search strategy combines hill-climbing systematic search, show powerful heuristic information extracted used prune search space. ff successful automatic planner recent aips-2000 planning competition. review results competition, give data benchmark domains, investigate reasons runtime performance ff compared hsp.",4 "human communication systems evolve cultural selection. human communication systems, language, evolve culturally; components undergo reproduction variation. however, role selection cultural evolutionary dynamics less clear. often neutral evolution (also known 'drift') models, used explain evolution human communication systems, cultural evolution generally. account, cultural change unbiased: instance, vocabulary, baby names pottery designs found spread random copying. drift null hypothesis models cultural evolution always adequately explain empirical results. 
alternative models include cultural selection, assumes variant adoption biased. theoretical models human communication argue conversation interlocutors biased adopt labels aspects linguistic representation (including prosody syntax). basic alignment mechanism extended computer simulation account emergence linguistic conventions. agents biased match linguistic behavior interlocutor, single variant propagate across entire population interacting computer agents. behavior-matching account operates level individual. call conformity-biased model. different selection account, called content-biased selection, functional selection replicator selection, variant adoption depends upon intrinsic value particular variant (e.g., ease learning use). second alternative account operates level cultural variant. following boyd richerson call content-biased model. present paper tests drift model two biased selection models' ability explain spread communicative signal variants experimental micro-society.",4 "optimizing non-decomposable performance measures: tale two classes. modern classification problems frequently present mild severe label imbalance well specific requirements classification characteristics, require optimizing performance measures non-decomposable dataset, f-measure. measures spurred much interest pose specific challenges learning algorithms since non-additive nature precludes direct application well-studied large scale optimization methods stochastic gradient descent. paper reveal two large families performance measures expressed functions true positive/negative rates, indeed possible implement point stochastic updates. families consider concave pseudo-linear functions tpr, tnr cover several popularly used performance measures f-measure, g-mean h-mean. core contribution adaptive linearization scheme families, using develop optimization techniques enable truly point-based stochastic updates. 
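The conformity-biased (behavior-matching) account described above predicts that when agents are biased to adopt their interlocutor's variant, a single variant propagates across the population. A toy simulation of that dynamic; population size and bias strength are invented parameters, not the experimental micro-society's:

```python
import random

# Toy conformity-biased model: in each round a random pair interacts and,
# with some bias probability, the first agent matches the second agent's
# variant. With ten initial variants, one eventually goes to fixation.
random.seed(3)
pop = list(range(10))                    # ten agents, ten initial variants

for rounds in range(1, 100_000):
    a, b = random.sample(range(len(pop)), 2)
    if random.random() < 0.8:            # biased to match the interlocutor
        pop[a] = pop[b]
    if len(set(pop)) == 1:               # one variant has gone to fixation
        break
```

Because matching operates at the level of the individual interaction, fixation happens without any variant being intrinsically better, in contrast to the content-biased account.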
concave performance measures propose spade, stochastic primal dual solver; pseudo-linear measures propose stamp, stochastic alternate maximization procedure. methods crisp convergence guarantees, demonstrate significant speedups existing methods - often order magnitude more, give similar accurate predictions test data.",19 "tap-dlnd 1.0 : corpus document level novelty detection. detecting novelty entire document artificial intelligence (ai) frontier problem widespread nlp applications, extractive document summarization, tracking development news events, predicting impact scholarly articles, etc. important though problem is, unaware benchmark document level data correctly addresses evaluation automatic novelty detection techniques classification framework. bridge gap, present resource benchmarking techniques document level novelty detection. create resource via event-specific crawling news documents across several domains periodic manner. release annotated corpus necessary statistics show use developed system problem concern.",4 "ocular dominance patterns mammalian visual cortex: wire length minimization approach. propose theory ocular dominance (od) patterns mammalian primary visual cortex. theory based premise od pattern adaptation minimize length intra-cortical wiring. thus understand existing od patterns solving wire length minimization problem. divide neurons two classes: left-eye dominated right-eye dominated. find segregation neurons monocular regions reduces wire length number connections neurons class differs class. shape regions depends relative fraction neurons two classes. numbers close find optimal od pattern consists interdigitating stripes. one class less numerous other, optimal od pattern consists patches first class neurons sea class neurons. predict transition stripes patches fraction neurons dominated ipsilateral eye 40%. prediction agrees data macaque cebus monkeys. theory applied binary cortical systems.",3 "reference-aware language models. 
propose general class language models treat reference explicit stochastic latent variable. architecture allows models create mentions entities attributes accessing external databases (required by, e.g., dialogue generation recipe generation) internal state (required by, e.g. language models aware coreference). facilitates incorporation information accessed predictable locations databases discourse context, even targets reference may rare words. experiments three tasks shows model variants based deterministic attention.",4 "spatio-temporal facial expression recognition using convolutional neural networks conditional random fields. automated facial expression recognition (fer) challenging task decades. many existing works use hand-crafted features lbp, hog, lpq, histogram optical flow (hof) combined classifiers support vector machines expression recognition. methods often require rigorous hyperparameter tuning achieve good results. recently deep neural networks (dnn) shown outperform traditional methods visual object recognition. paper, propose two-part network consisting dnn-based architecture followed conditional random field (crf) module facial expression recognition videos. first part captures spatial relation within facial images using convolutional layers followed three inception-resnet modules two fully-connected layers. capture temporal relation image frames, use linear chain crf second part network. evaluate proposed network three publicly available databases, viz. ck+, mmi, fera. experiments performed subject-independent cross-database manners. experimental results show cascading deep network architecture crf module considerably increases recognition facial expressions videos particular outperforms state-of-the-art methods cross-database experiments yields comparable results subject-independent experiments.",4 "supervised generative reconstruction: efficient way flexibly store recognize patterns. 
matching animal-like flexibility recognition ability quickly incorporate new information remains difficult. limits yet adequately addressed neural models recognition algorithms. work proposes configuration recognition maintains function conventional algorithms avoids combinatorial problems. feedforward recognition algorithms classical artificial neural networks machine learning algorithms known subject catastrophic interference forgetting. modifying learning new information (associations patterns labels) causes loss previously learned information. demonstrate using mathematical analysis supervised generative models, feedforward feedback connections, emulate feedforward algorithms yet avoid catastrophic interference forgetting. learned information generative models stored intuitive form represents fixed points solutions network moreover displays similar difficulties cognitive phenomena. brain-like capabilities limits associated generative models suggest brain may perform recognition store information using similar approach. central role recognition, progress understanding underlying principles may reveal significant insight better study integrate brain.",4 "neural machine translation benefit larger context?. propose neural machine translation architecture models surrounding text addition source sentence. models lead better performance, terms general translation quality pronoun prediction, trained small corpora, although improvement largely disappears trained larger corpus. also discover attention-based neural machine translation well suited pronoun prediction compares favorably approaches specifically designed task.",19 "similarity in intension vs. in extension: at the crossroads of computer science and theater. traditional staging based formal approach similarity leaning dramaturgical ontologies instantiation variations.
inspired interactive data mining, suggests different approaches, give overview computer science theater researches using computers partners actor escape priori specification roles.",4 "shape tracking occlusions via coarse-to-fine region-based sobolev descent. present method track precise shape object video based new modeling optimization new riemannian manifold parameterized regions. joint dynamic shape appearance models, template object propagated match object shape radiance next frame, advantageous methods employing global image statistics cases complex object radiance cluttered background. cases 3d object motion viewpoint change, self-occlusions dis-occlusions object prominent, current methods employing joint shape appearance models unable adapt new shape appearance information, leading inaccurate shape detection. work, model self-occlusions dis-occlusions joint shape appearance tracking framework. self-occlusions warp propagate template coupled, thus joint problem formulated. derive coarse-to-fine optimization scheme, advantageous object tracking, initially perturbs template coarse perturbations transitioning finer-scale perturbations, traversing scales, seamlessly automatically. scheme gradient descent novel infinite-dimensional riemannian manifold introduce. manifold consists planar parameterized regions, metric introduce novel sobolev-type metric defined infinitesimal vector fields regions. metric property resulting gradient descent automatically favors coarse-scale deformations (when reduce energy) moving finer-scale deformations. experiments video exhibiting occlusion/dis-occlusion, complex radiance background show occlusion/dis-occlusion modeling leads superior shape accuracy compared recent methods employing joint shape/appearance models employing global statistics.",4 "constrained fractional set programs application local clustering community detection. 
(constrained) minimization ratio set functions problem frequently occurring clustering community detection. optimization problems typically np-hard, one uses convex spectral relaxations practice. relaxations solved globally optimally, often loose thus lead results far away optimum. paper show every constrained minimization problem ratio non-negative set functions allows tight relaxation unconstrained continuous optimization problem. result leads flexible framework solving constrained problems network analysis. globally optimal solution resulting non-convex problem cannot guaranteed, outperform loose convex spectral relaxations large margin constrained local clustering problems.",19 "redefining part-of-speech classes distributional semantic models. paper studies word embeddings trained british national corpus interact part speech boundaries. work targets universal pos tag set, currently actively used annotation range languages. experiment training classifiers predicting pos tags words based embeddings. results show information pos affiliation contained distributional vectors allows us discover groups words distributional patterns differ words part speech. data often reveals hidden inconsistencies annotation process guidelines. time, supports notion `soft' `graded' part speech affiliations. finally, show information pos distributed among dozens vector components, limited one two features.",4 "fully adaptive algorithm pure exploration linear bandits. propose first fully-adaptive algorithm pure exploration linear bandits---the task find arm largest expected reward, depends unknown parameter linearly. existing methods partially entirely fix sequences arm selections observing rewards, method adaptively changes arm selection strategy based past observations round. show sample complexity matches achievable lower bound constant factor extreme case. 
furthermore, evaluate performance methods simulations based synthetic setting real-world data, method shows vast improvement existing methods.",19 "semantic texture robust dense tracking. argue robust dense slam systems make valuable use layers features coming standard cnn pyramid `semantic texture' suitable dense alignment much robust nuisance factors lighting raw rgb values. use straightforward lucas-kanade formulation image alignment, schedule iterations coarse-to-fine levels pyramid, simply replace usual image pyramid hierarchy convolutional feature maps pre-trained cnn. resulting dense alignment performance much robust lighting variations, show camera rotation tracking experiments time-lapse sequences captured many hours. looking towards future scene representation real-time visual slam, demonstrate selection using simple criteria small number total set features output cnn gives accurate much efficient tracking performance.",4 "automatic detection fake news. proliferation misleading information everyday access media outlets social media feeds, news blogs, online newspapers made challenging identify trustworthy news sources, thus increasing need computational tools able provide insights reliability online content. paper, focus automatic identification fake content online news. contribution twofold. first, introduce two novel datasets task fake news detection, covering seven different news domains. describe collection, annotation, validation process detail present several exploratory analysis identification linguistic differences fake legitimate news content. second, conduct set learning experiments build accurate fake news detectors. addition, provide comparative analyses automatic manual identification fake news.",4 "using noisy extractions discover causal knowledge. knowledge bases (kb) constructed information extraction text play important role query answering reasoning. 
work, study particular reasoning task, problem discovering causal relationships entities, known causal discovery. two contrasting types approaches discovering causal knowledge. one approach attempts identify causal relationships text using automatic extraction techniques, approach infers causation observational data. however, extractions alone often insufficient capture complex patterns full observational data expensive obtain. introduce probabilistic method fusing noisy extractions observational data discover causal knowledge. propose principled approach uses probabilistic soft logic (psl) framework encode well-studied constraints recover long-range patterns consistent predictions, cheaply acquired extractions provide proxy unseen observations. apply method gene regulatory networks show promise exploiting kb signals causal discovery, suggesting critical, new area research.",4 "sense embedding learning word sense induction. conventional word sense induction (wsi) methods usually represent instance discrete linguistic features cooccurrence features, train model polysemous word individually. work, propose learn sense embeddings wsi task. training stage, method induces several sense centroids (embedding) polysemous word. testing stage, method represents instance contextual vector, induces sense finding nearest sense centroid embedding space. advantages method (1) distributed sense vectors taken knowledge representations trained discriminatively, usually better performance traditional count-based distributional models, (2) general model whole vocabulary jointly trained induce sense centroids multitask learning framework. evaluated semeval-2010 wsi dataset, method outperforms participants recent state-of-the-art methods. verify two advantages comparing carefully designed baselines.",4 "web page categorization using artificial neural networks. web page categorization one challenging tasks world ever increasing web technologies.
many ways categorization web pages based different approach features. paper proposes new dimension way categorization web pages using artificial neural network (ann) extracting features automatically. eight major categories web pages selected categorization; business & economy, education, government, entertainment, sports, news & media, job search, science. whole process proposed system done three successive stages. first stage, features automatically extracted analyzing source web pages. second stage includes fixing input values neural network; values remain 0 1. variations values affect output. finally third stage determines class certain web page eight predefined classes. stage done using back propagation algorithm artificial neural network. proposed concept facilitate web mining, retrievals information web also search engines.",4 "information-theoretic limits bayesian network structure learning. paper, study information-theoretic limits learning structure bayesian networks (bns), discrete well continuous random variables, finite number samples. show minimum number samples required procedure recover correct structure grows $\omega(m)$ $\omega(k \log m + (k^2/m))$ non-sparse sparse bns respectively, $m$ number variables $k$ maximum number parents per node. provide simple recipe, based extension fano's inequality, obtain information-theoretic limits structure recovery exponential family bn. instantiate result specific conditional distributions exponential family characterize fundamental limits learning various commonly used bns, conditional probability table based networks, gaussian bns, noisy-or networks, logistic regression networks. en route obtaining main results, obtain tight bounds number sparse non-sparse essential-dags. finally, byproduct, recover information-theoretic limits sparse variable selection logistic regression.",4 "discussion: latent variable graphical model selection via convex optimization.
discussion ""latent variable graphical model selection via convex optimization"" venkat chandrasekaran, pablo a. parrilo alan s. willsky [arxiv:1008.1290].",12 "nonconvex matrix factorization rank-one measurements. consider problem recovering low-rank matrices random rank-one measurements, spans numerous applications including covariance sketching, phase retrieval, quantum state tomography, learning shallow polynomial neural networks, among others. approach directly estimate low-rank factor minimizing nonconvex quadratic loss function via vanilla gradient descent, following tailored spectral initialization. true rank small, algorithm guaranteed converge ground truth (up global ambiguity) near-optimal sample complexity computational complexity. best knowledge, first guarantee achieves near-optimality metrics. particular, key enabler near-optimal computational guarantees implicit regularization phenomenon: without explicit regularization, spectral initialization gradient descent iterates automatically stay within region incoherent measurement vectors. feature allows one employ much aggressive step sizes compared ones suggested prior literature, without need sample splitting.",4 "parallel statistical multi-resolution estimation. discuss several strategies implement dykstra's projection algorithm nvidia's compute unified device architecture (cuda). dykstra's algorithm central step computationally expensive part statistical multi-resolution methods. projects given vector onto intersection convex sets. compared cpu implementation cuda implementation one order magnitude faster. speed reduce memory consumption developed new variant, call incomplete dykstra's algorithm. implemented cuda one order magnitude faster cuda implementation standard dykstra algorithm. sample application discuss using incomplete dykstra's algorithm preprocessor recently developed super-resolution optical fluctuation imaging (sofi) method (dertinger et al. 2009). 
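The rank-one-measurements abstract above estimates a low-rank factor by vanilla gradient descent from a spectral initialization. A hedged rank-one sketch of that recipe (observe $y_i = (a_i^\top x)^2$); problem sizes and step size are illustrative assumptions:

```python
import numpy as np

# Recover x from y_i = (a_i^T x)^2: spectral initialization, then plain
# gradient descent on a quadratic loss. Sizes are invented for the demo.
rng = np.random.default_rng(2)
d, m = 10, 400
x_star = rng.standard_normal(d)
A = rng.standard_normal((m, d))
y = (A @ x_star) ** 2                    # rank-one measurements

# spectral initialization: top eigenvector of (1/m) sum_i y_i a_i a_i^T
M = (A * y[:, None]).T @ A / m
_, vecs = np.linalg.eigh(M)
x = vecs[:, -1] * np.sqrt(np.mean(y))    # E[y] = ||x_star||^2 sets the scale

step = 0.1 / np.mean(y)                  # step size scaled by signal energy
for _ in range(2000):                    # minimize (1/4m) sum ((a^T x)^2 - y)^2
    r = (A @ x) ** 2 - y
    x -= step * (A * (r * (A @ x))[:, None]).sum(0) / m
err = min(np.linalg.norm(x - x_star), np.linalg.norm(x + x_star))
```

The global sign of x is unidentifiable from squared measurements, hence the min over both signs when measuring the error.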
show statistical multi-resolution estimation enhance resolution improvement plain sofi algorithm fourier-reweighting sofi. results compared terms power spectrum fourier ring correlation (saxton baumeister 1982). fourier ring correlation indicates resolution typical second order sofi images improved 30 per cent. results show careful parallelization dykstra's algorithm enables use large-scale statistical multi-resolution analyses.",15 "resolution unidentified words machine translation. paper presents mechanism resolving unidentified lexical units text-based machine translation (tbmt). machine translation (mt) system unlikely complete lexicon hence intense need new mechanism handle problem unidentified words. unknown words could abbreviations, names, acronyms newly introduced terms. proposed algorithm resolution unidentified words. algorithm takes discourse unit (primitive discourse) unit analysis provides real time updates lexicon. manually applied algorithm news paper fragments. along anaphora cataphora resolution, many unknown words especially names abbreviations updated lexicon.",4 "unified heuristic annotated bibliography large class earliness-tardiness scheduling problems. work proposes unified heuristic algorithm large class earliness-tardiness (e-t) scheduling problems. consider single/parallel machine e-t problems may may consider additional features idle time, setup times release dates. addition, also consider problems whose objective minimize either total (average) weighted completion time total (average) weighted flow time, arise particular cases due dates jobs either set zero associated release dates, respectively. developed local search based metaheuristic framework quite simple, time relies sophisticated procedures efficiently performing local search according characteristics problem. present efficient move evaluation approaches parallel machine problems generalize existing ones single machine problems. 
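The fourier ring correlation (saxton & baumeister 1982) used in the SOFI abstract above to quantify resolution correlates two images of the same object ring by ring in fourier space. A hedged sketch on synthetic images (the data and sizes are invented):

```python
import numpy as np

# Fourier ring correlation: one normalized correlation value per ring of
# spatial frequencies, computed between two independent images.
def frc(img1, img2):
    F1 = np.fft.fftshift(np.fft.fft2(img1))
    F2 = np.fft.fftshift(np.fft.fft2(img2))
    n = img1.shape[0]
    y, x = np.indices(img1.shape)
    r = np.hypot(x - n // 2, y - n // 2).astype(int).ravel()
    num = np.bincount(r, (F1 * np.conj(F2)).real.ravel())
    d1 = np.bincount(r, (np.abs(F1) ** 2).ravel())
    d2 = np.bincount(r, (np.abs(F2) ** 2).ravel())
    with np.errstate(invalid="ignore", divide="ignore"):
        return num / np.sqrt(d1 * d2)

rng = np.random.default_rng(0)
base = rng.standard_normal((64, 64))     # shared "object"
curve = frc(base + 0.1 * rng.standard_normal((64, 64)),
            base + 0.1 * rng.standard_normal((64, 64)))
# same object with independent noise: curve is near 1 at low frequencies
```

The radius where the curve drops below a chosen threshold is then read off as the resolution estimate.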
algorithm tested hundreds instances several e-t problems particular cases. results obtained show unified heuristic capable producing high quality solutions compared best ones available literature obtained specific methods. moreover, provide extensive annotated bibliography problems related considered work, indicate approach(es) used publication, also point characteristics problem(s) considered. beyond that, classify existing methods different categories better idea popularity type solution procedure.",4 "task specific visual saliency prediction memory augmented conditional generative adversarial networks. visual saliency patterns result variety factors aside image parsed, however existing approaches ignored these. address limitation, propose novel saliency estimation model leverages semantic modelling power conditional generative adversarial networks together memory architectures capture subject's behavioural patterns task dependent factors. make contributions aiming bridge gap bottom-up feature learning capabilities modern deep learning architectures traditional top-down hand-crafted features based methods task specific saliency modelling. conditional nature proposed framework enables us learn contextual semantics relationships among different tasks together, instead learning separately task. studies shed light novel application area generative adversarial networks, also emphasise importance task specific saliency modelling demonstrate plausibility fully capturing context via augmented memory architecture.",4 "learning flexible reusable locomotion primitives microrobot. design gaits robot locomotion daunting process requires significant expert knowledge engineering. process even challenging robots accurate physical model, compliant micro-scale robots. data-driven gait optimization provides automated alternative analytical gait design. paper, propose novel approach efficiently learn wide range locomotion tasks walking robots. 
Our approach formalizes locomotion as a contextual policy search task to collect data, and subsequently uses that data to learn multi-objective locomotion primitives that can be used for planning. As a proof of concept we consider a simulated hexapod modeled after a recently developed microrobot, and we thoroughly evaluate the performance of this microrobot on different tasks and gaits. Our results validate the proposed controller and learning scheme on single and multi-objective locomotion tasks. Moreover, the experimental simulations show that without any prior knowledge about the robot used (e.g., a dynamics model), our approach is capable of learning locomotion primitives within 250 trials and of subsequently using them to successfully navigate through a maze.",4 "Good arm identification via bandit feedback. We consider a novel stochastic multi-armed bandit problem called {\em good arm identification} (GAI), where a good arm is defined as an arm with expected reward greater than or equal to a given threshold. GAI is a pure-exploration problem in which a single agent repeats a process of outputting an arm as soon as it is identified as a good one, before confirming whether the other arms are actually good. The objective of GAI is to minimize the number of samples for each such process. We find that GAI faces a new kind of dilemma, the {\em exploration-exploitation dilemma of confidence}, which is different from the difficulty of best arm identification. As a result, an efficient design of algorithms for GAI is quite different from that for best arm identification. We derive a lower bound on the sample complexity of GAI that is tight up to the logarithmic factor $\mathrm{O}(\log \frac{1}{\delta})$ of the acceptance error rate $\delta$. We also develop an algorithm whose sample complexity almost matches the lower bound. We further confirm experimentally that the proposed algorithm outperforms naive algorithms in synthetic settings based on a conventional bandit problem and in clinical trial research on rheumatoid arthritis.",19 "A decision theoretic approach to targeted advertising. A simple advertising strategy that can be used to help increase sales of a product is to mail out special offers to selected potential customers. Because there is a cost associated with sending each offer, the optimal mailing strategy depends both on the benefit obtained from a purchase and on how the offer affects the buying behavior of the customers.
In this paper, we describe two methods for partitioning the potential customers into groups, and show how to perform a simple cost-benefit analysis to decide which, if any, of the groups should be targeted. In particular, we consider two decision-tree learning algorithms. The first is an ""off the shelf"" algorithm used to model the probability that groups of customers will buy the product. The second is a new algorithm that is similar to the first, except that for each group it explicitly models the probability of purchase under the two mailing scenarios: (1) the mail is sent to members of that group and (2) the mail is not sent to members of that group. Using data from a real-world advertising experiment, we compare the algorithms to each other and to a naive mail-to-all strategy.",4 "Parallel training of DNNs with natural gradient and parameter averaging. We describe the neural-network training framework used in the Kaldi speech recognition toolkit, which is geared towards training DNNs with large amounts of training data using multiple GPU-equipped or multi-core machines. In order to be as hardware-agnostic as possible, we needed a way to use multiple machines without generating excessive network traffic. Our method is to average the neural network parameters periodically (typically every minute or two), and redistribute the averaged parameters to the machines for further training. Each machine sees different data. By itself, this method does not work very well. However, we have another method, an approximate and efficient implementation of natural gradient for stochastic gradient descent (NG-SGD), which seems to allow our periodic-averaging method to work well, as well as substantially improving the convergence of SGD on a single machine.",4 "Learning causal graphs with small interventions. We consider the problem of learning causal networks with interventions, when each intervention is limited in size, under Pearl's structural equation model with independent errors (SEM-IE). The objective is to minimize the number of experiments needed to discover the causal directions of all the edges in a causal graph. Previous work has focused on the use of separating systems for complete graphs for this task. We prove that any deterministic adaptive algorithm needs to be a separating system in order to learn complete graphs in the worst case. In addition, we present a novel separating system construction, whose size is close to optimal and which is arguably simpler than previous work in combinatorics.
We also develop a novel information theoretic lower bound on the number of interventions that applies in full generality, including to randomized adaptive learning algorithms. For general chordal graphs, we derive worst case lower bounds on the number of interventions. Building on observations about induced trees, we give a new deterministic adaptive algorithm that learns the directions on any chordal skeleton completely. In the worst case, our achievable scheme is an $\alpha$-approximation algorithm where $\alpha$ is the independence number of the graph. We also show that there exist graph classes for which the sufficient number of experiments is close to the lower bound. At the other extreme, there are graph classes for which the required number of experiments is multiplicatively $\alpha$ away from our lower bound. In simulations, our algorithm almost always performs very close to the lower bound, while the approach based on separating systems for complete graphs is significantly worse on random chordal graphs.",4 "Improving facial attribute prediction using semantic segmentation. Attributes are semantically meaningful characteristics whose applicability widely crosses category boundaries. They are particularly important in describing and recognizing concepts for which no explicit training example is given, \textit{e.g., zero-shot learning}. Additionally, since attributes are human describable, they can be used for efficient human-computer interaction. In this paper, we propose to employ semantic segmentation to improve facial attribute prediction. The core idea lies in the fact that many facial attributes describe local properties. In other words, the probability of an attribute appearing in a face image is far from uniform over the spatial domain. We build our facial attribute prediction model jointly with a deep semantic segmentation network. This harnesses the localization cues learned by the semantic segmentation to guide the attention of the attribute prediction to the regions where different attributes naturally show up. As a result of this approach, in addition to recognition, we are able to localize the attributes, despite merely having access to image-level labels (weak supervision) during training. We evaluate our proposed method on the CelebA and LFWA datasets and achieve superior results to the prior arts. Furthermore, we show that in the reverse problem, semantic face parsing improves when facial attributes are available.
This reaffirms the need to jointly model these two interconnected tasks.",4 "Disfluency detection using a bidirectional LSTM. We introduce a new approach for disfluency detection using a bidirectional long short-term memory neural network (BLSTM). In addition to the word sequence, the model takes as input pattern-match features that were developed to reduce sensitivity to vocabulary size in training, and which lead to improved performance over the word sequence alone. The BLSTM takes advantage of explicit repair states in addition to the standard reparandum states. The final output leverages integer linear programming to incorporate constraints of disfluency structure. In experiments on the Switchboard corpus, the model achieves state-of-the-art performance on both the standard disfluency detection task and the correction detection task. Analysis shows that the model has better detection of non-repetition disfluencies, which tend to be much harder to detect.",4 "Identity alignment by noisy pixel removal. Identity alignment models assume precisely annotated images, labelled manually. Such human labelling is unrealistic for large-sized imagery data. Detection models instead introduce varying amounts of noise, which hampers identity alignment performance. In this work, we propose to refine images by removing undesired pixels. This is achieved by learning to eliminate pixels that are less informative for identity alignment. To this end, we formulate a method for automatically detecting and removing identity-class-irrelevant pixels in auto-detected bounding boxes. Experiments validate the benefits of our model in improving identity alignment.",4 "Bootstrapped adaptive threshold selection for statistical model selection and estimation. A central goal of neuroscience is to understand how activity in the nervous system is related to features of the external world, or to features of the nervous system itself. A common approach is to model neural responses as a weighted combination of external features, or vice versa. The structure of the model weights can provide insight into neural representations. Often, neural input-output relationships are sparse, with only a few inputs contributing to the output. In part to account for such sparsity, structured regularizers are incorporated into model fitting optimization. However, by imposing priors, structured regularizers can make it difficult to interpret the learned model parameters.
Here, we investigate a simple, minimally structured model estimation method for accurate, unbiased estimation of sparse models, based on bootstrapped adaptive threshold selection followed by ordinary least-squares refitting (BoATS). Through extensive numerical investigations, we show that this method often performs favorably compared to L1 and L2 regularizers. In particular, for a variety of model distributions and noise levels, BoATS accurately recovers the parameters of sparse models, leading to more parsimonious explanations of the outputs. Finally, we apply this method to the task of decoding human speech production from ECoG recordings.",19 "Highlighting objects of interest in an image by integrating saliency and depth. Stereo images have been captured primarily for 3D reconstruction in the past. However, depth information acquired from stereo can also be used along with saliency to highlight certain objects in a scene. This approach can be used to make still images more interesting to look at, and to highlight objects of interest in the scene. We introduce this novel direction in this paper, and discuss the theoretical framework behind the approach. Even though we use depth from stereo in this work, our approach is applicable to depth data acquired from any sensor modality. Experimental results on indoor and outdoor scenes demonstrate the benefits of our algorithm.",4 "Embedded deep learning based word prediction. Recent developments in deep learning with applications to language modeling have led to success in tasks such as text processing, summarizing and machine translation. However, deploying huge language models on mobile devices for on-device keyboards poses a computation bottleneck due to their puny computation capacities. In this work we propose an embedded deep learning based word prediction method that optimizes run-time memory and also provides a real-time prediction environment. Our model has a size of 7.40MB and an average prediction time of 6.47 ms. We improve over existing methods for word prediction in terms of keystroke savings and word prediction rate.",4 "End-to-end video classification with knowledge graphs. Video understanding has attracted much research attention, especially since the recent availability of large-scale video benchmarks. In this paper, we address the problem of multi-label video classification.
We first observe that there exists a significant knowledge gap between how machines and humans learn. That is, while current machine learning approaches, including deep neural networks, largely focus on the representations of the given data, humans often look beyond the data at hand and leverage external knowledge to make better decisions. Towards narrowing this gap, we propose to incorporate external knowledge graphs into video classification. In particular, we unify traditional ""knowledgeless"" machine learning models and knowledge graphs in a novel end-to-end framework. The framework is flexible enough to work with existing video classification algorithms, including state-of-the-art deep models. Finally, we conduct extensive experiments on the largest public video dataset, YouTube-8M. The results are promising across the board, improving mean average precision by 2.9%.",4 "Input warping for Bayesian optimization of non-stationary functions. Bayesian optimization has proven to be a highly effective methodology for the global optimization of unknown, expensive and multimodal functions. The ability to accurately model distributions over functions is critical to the effectiveness of Bayesian optimization. Although Gaussian processes provide a flexible prior over functions which can be queried efficiently, various classes of functions remain difficult to model. One of the most frequently occurring of these is the class of non-stationary functions. The optimization of the hyperparameters of machine learning algorithms is a problem domain in which parameters are often manually transformed a priori, for example by optimizing in ""log-space,"" to mitigate the effects of spatially-varying length scale. We develop a methodology for automatically learning a wide family of bijective transformations, or warpings, of the input space using the beta cumulative distribution function. We further extend the warping framework to multi-task Bayesian optimization so that multiple tasks can be warped into a jointly stationary space. On a set of challenging benchmark optimization tasks, we observe that the inclusion of warping greatly improves on the state-of-the-art, producing better results faster and more reliably.",19 "View-invariant recognition of action style self-dissimilarity. Self-similarity was recently introduced as a measure of inter-class congruence for classification of actions.
Herein, we investigate the dual problem of intra-class dissimilarity for classification of action styles. We introduce self-dissimilarity matrices that discriminate between actions performed by different subjects regardless of viewing direction and camera parameters. We investigate two frameworks using these invariant style dissimilarity measures, based on principal component analysis (PCA) and Fisher discriminant analysis (FDA). Extensive experiments performed on the IXMAS dataset indicate remarkably good discriminant characteristics of the proposed invariant measures for gender recognition from video data.",4 "Many languages, one parser. We train one multilingual model for dependency parsing and use it to parse sentences in several languages. The parsing model uses (i) multilingual word clusters and embeddings; (ii) token-level language information; and (iii) language-specific features (fine-grained POS tags). This input representation enables the parser not only to parse effectively in multiple languages, but also to generalize across languages based on linguistic universals and typological similarities, making it more effective to learn from limited annotations. Our parser's performance compares favorably to strong baselines in a range of data scenarios, including when the target language has a large treebank, a small treebank, or no treebank for training.",4 "TreeView: peeking into deep neural networks via feature-space partitioning. With the advent of highly predictive but opaque deep learning models, it has become more important than ever to understand and explain the predictions of such models. Existing approaches define interpretability as the inverse of complexity and achieve interpretability at the cost of accuracy. This introduces the risk of producing interpretable but misleading explanations. As humans, we are prone to engage in this kind of behavior \cite{mythos}. In this paper, we take a step in the direction of tackling the problem of interpretability without compromising model accuracy. We propose to build a TreeView representation of the complex model via hierarchical partitioning of the feature space, which reveals the iterative rejection of unlikely class labels until the correct association is predicted.",19 "Correlation-based construction of neighborhood and edge features.
Motivated by an abstract notion of low-level edge detector filters, we propose a simple method of unsupervised feature construction based on pairwise statistics of features. In the first step, we construct neighborhoods of features by regrouping features that correlate. Then we use these subsets as filters to produce new neighborhood features. Next, we connect neighborhood features that correlate, and construct edge features by subtracting the correlated neighborhood features from each other. To validate the usefulness of the constructed features, we ran AdaBoost.MH on four multi-class classification problems. Our most significant result is a test error of 0.94% on MNIST with an algorithm which is essentially free of any image-specific priors. On CIFAR-10 our method is suboptimal compared to today's best deep learning techniques; nevertheless, we show that the proposed method outperforms not only boosting on raw pixels, but also boosting on Haar filters.",4 "Transfer deep learning for low-resource Chinese word segmentation with a novel neural network. Recent studies have shown the effectiveness of using neural networks for Chinese word segmentation. However, these models rely on large-scale data and are less effective on low-resource datasets with insufficient training data. We propose a transfer learning method to improve low-resource word segmentation by leveraging high-resource corpora. First, we train a teacher model on high-resource corpora and use the learned knowledge to initialize a student model. Second, a weighted data similarity method is proposed to train the student model on low-resource data. Experiment results show that our work significantly improves performance on low-resource datasets: by 2.3% and 1.5% F-score on the PKU and CTB datasets. Furthermore, this paper achieves state-of-the-art results: 96.1% and 96.2% F-score on the PKU and CTB datasets.",4 "Hybridization of evolutionary algorithms. Evolutionary algorithms are good general problem solvers but suffer from a lack of domain-specific knowledge. However, problem-specific knowledge can be added to evolutionary algorithms by hybridizing. Interestingly, all the elements of evolutionary algorithms can be hybridized. In this chapter, the hybridization of three elements of evolutionary algorithms is discussed: the objective function, the survivor selection operator and the parameter settings.
For the objective function, an existing heuristic function that constructs a solution to the problem in the traditional way is used. However, this function is embedded into the evolutionary algorithm, where it serves as a generator of new solutions. In addition, the objective function is improved by local search heuristics. A new neutral selection operator has been developed that is capable of dealing with neutral solutions, i.e. solutions that have different representations but expose equal values of the objective function. The aim of this operator is to direct the evolutionary search into new, undiscovered regions of the search space. To avoid wrong settings of the parameters that control the behavior of the evolutionary algorithm, self-adaptation is used. Finally, the hybrid self-adaptive evolutionary algorithm is applied to two real-world NP-hard problems: graph 3-coloring and the optimization of markers in the clothing industry. Extensive experiments have shown that the hybridization improves the results of evolutionary algorithms considerably. Furthermore, the impact of the particular hybridizations is analyzed in detail as well.",4 "Reconstruction-based disentanglement for pose-invariant face recognition. Deep neural networks (DNNs) trained on large-scale datasets have recently achieved impressive improvements in face recognition. But a persistent challenge remains to develop methods capable of handling large pose variations that are relatively underrepresented in training data. This paper presents a method for learning a feature representation that is invariant to pose, without requiring extensive pose coverage in training data. We first propose to generate non-frontal views from a single frontal face, in order to increase the diversity of training data while preserving accurate facial details that are critical for identity discrimination. Our next contribution is to seek a rich embedding that encodes identity features, as well as non-identity ones such as pose and landmark locations. Finally, we propose a new feature reconstruction metric learning to explicitly disentangle identity and pose, by demanding alignment between the feature reconstructions through various combinations of identity and pose features, which are obtained from two images of the same subject.
Experiments on both controlled and in-the-wild face datasets, such as MultiPIE, 300WLP and the profile view database CFP, show that our method consistently outperforms the state-of-the-art, especially on images with large head pose variations. For detailed results and resources, refer to https://sites.google.com/site/xipengcshomepage/iccv2017",4 "Stacked transfer learning for tropical cyclone intensity prediction. Tropical cyclone wind-intensity prediction is a challenging task considering the drastic changes in climate patterns over the last decades. In order to develop robust prediction models, one needs to consider different characteristics of cyclones in terms of spatial and temporal characteristics. Transfer learning incorporates knowledge from a related source dataset to complement a target dataset, especially in cases where there is a lack of data. Stacking is a form of ensemble learning focused on improving generalization that has recently been used for transfer learning problems, referred to as transfer stacking. In this paper, we employ transfer stacking as a means of studying the effects of cyclones, whereby we evaluate whether cyclones in different geographic locations can be helpful in improving generalization performance. Moreover, we use conventional neural networks for evaluating the effects of the duration of cyclones on prediction performance. We therefore develop an effective strategy that evaluates the relationships between different types of cyclones through transfer learning and conventional learning methods via neural networks.",4 "Parallel Markov chain Monte Carlo for the Indian buffet process. Indian buffet process based models are an elegant way of discovering underlying features within a data set, but inference in such models can be slow. Inferring underlying features using Markov chain Monte Carlo either relies on an uncollapsed representation, which leads to poor mixing, or on a collapsed representation, which leads to a quadratic increase in computational complexity. Existing attempts at distributing inference have introduced additional approximations within the inference procedure. In this paper we present a novel algorithm to perform asymptotically exact parallel Markov chain Monte Carlo inference for Indian buffet process models. We take advantage of the fact that the features are conditionally independent under the beta-Bernoulli process.
Because of this conditional independence, we can partition the features into two parts: one part containing only the finitely many instantiated features and the other part containing the infinite tail of uninstantiated features. For the finite partition, parallel inference is simple given the instantiation of the features. For the infinite tail, performing uncollapsed MCMC leads to poor mixing, and hence we collapse out the features. The resulting hybrid sampler, while being parallel, produces samples asymptotically from the true posterior.",19 "BubbleView: an interface for crowdsourcing image importance maps and tracking visual attention. In this paper, we present BubbleView, an alternative methodology for eye tracking using discrete mouse clicks to measure which information people consciously choose to examine. BubbleView is a mouse-contingent, moving-window interface in which participants are presented with a series of blurred images and click to reveal ""bubbles"" - small, circular areas of the image at original resolution, similar to the confined area of focus like the eye fovea. Across 10 experiments with 28 different parameter combinations, we evaluated BubbleView on a variety of image types: information visualizations, natural images, static webpages, and graphic designs, and compared the clicks to eye fixations collected with eye-trackers in controlled lab settings. We found that BubbleView clicks can (i) successfully approximate eye fixations on different images, and (ii) be used to rank image and design elements by importance. BubbleView is designed to collect clicks on static images, and works best for defined tasks such as describing the content of an information visualization or measuring image importance. BubbleView data is cleaner and more consistent than related methodologies that use continuous mouse movements. Our analyses validate the use of mouse-contingent, moving-window methodologies as approximating eye fixations for different image and task types.",4 "Towards an ontology-driven blockchain design for supply chain provenance. An interesting research problem in our age of big data is that of determining provenance. Granular evaluation of the provenance of physical goods--e.g.
tracking the ingredients of a pharmaceutical or demonstrating the authenticity of luxury goods--has often not been possible, since today's items are produced and transported in complex, inter-organizational, often internationally-spanning supply chains. The recent adoption of Internet of Things and blockchain technologies gives promise at better supply chain provenance. We are particularly interested in blockchain, as many of the favoured use cases of blockchain are for provenance tracking. We are also interested in applying ontologies, as there has been some work done on knowledge provenance, traceability, and food provenance using ontologies. In this paper, we make a case for why ontologies can contribute to blockchain design. To support this case, we analyze a traceability ontology and translate some of its representations to smart contracts that execute a provenance trace and enforce traceability constraints on the Ethereum blockchain platform.",4 "Deep semantic classification for 3D LiDAR data. Robots are expected to operate autonomously in dynamic environments. Understanding the underlying dynamic characteristics of objects is a key enabler for achieving this goal. In this paper, we propose a method for pointwise semantic classification of 3D LiDAR data into three classes: non-movable, movable and dynamic. We concentrate on understanding these specific semantics because they characterize important information required by an autonomous system. Non-movable points in the scene belong to unchanging segments of the environment, whereas the remaining classes correspond to the changing parts of the scene. The difference between the movable and dynamic classes is their motion state. Dynamic points can be perceived as moving, whereas movable objects can move but are perceived as static. To learn the distinction between movable and non-movable points in the environment, we introduce an approach based on deep neural networks, and for detecting dynamic points, we estimate pointwise motion. We propose a Bayes filter framework for combining the learned semantic cues with the motion cues to infer the required semantic classification. In extensive experiments, we compare our approach with other methods on a standard benchmark dataset and report competitive results in comparison to the existing state-of-the-art. Furthermore, we show an improvement in the classification of points by combining the semantic cues retrieved from the neural network with the motion cues.",4 "Boosting neural machine translation.
Training efficiency is one of the main problems for neural machine translation (NMT). Deep networks need large data as well as many training iterations to achieve state-of-the-art performance. This results in high computation cost, slowing down research and industrialisation. In this paper, we propose to alleviate this problem with several training methods based on data boosting and bootstrap, with no modifications to the neural network. This imitates the learning process of humans, who typically spend more time learning ""difficult"" concepts than easier ones. We experiment on an English-French translation task, showing accuracy improvements of up to 1.63 BLEU while saving 20% of training time.",4 "Predictive coding-based deep dynamic neural network for visuomotor learning. This study presents a dynamic neural network model based on the predictive coding framework for perceiving and predicting dynamic visuo-proprioceptive patterns. In a previous study [1], we have shown that a deep dynamic neural network model was able to coordinate visual perception and action generation in a seamless manner. In the current study, we extended the previous model under the predictive coding framework to endow the model with the capability of perceiving and predicting dynamic visuo-proprioceptive patterns, as well as the capability of inferring the intention behind the perceived visuomotor information by minimizing prediction error. A set of synthetic experiments was conducted in which a robot learned to imitate the gestures of another robot in a simulation environment. The experimental results showed that, given intention states, the model was able to mentally simulate the possible incoming dynamic visuo-proprioceptive patterns in a top-down process without inputs from the external environment. Moreover, the results highlighted the role of minimizing prediction error in inferring the underlying intention of the perceived visuo-proprioceptive patterns, supporting the predictive coding account of the mirror neuron systems. The results also revealed that minimizing prediction error in one modality induced the recall of the corresponding representation of another modality acquired during the consolidative learning of raw-level visuo-proprioceptive patterns.",4 "Enhanced deep residual networks for single image super-resolution.
Recent research on super-resolution has progressed with the development of deep convolutional neural networks (DCNN). In particular, residual learning techniques exhibit improved performance. In this paper, we develop an enhanced deep super-resolution network (EDSR) with performance exceeding that of current state-of-the-art SR methods. The significant performance improvement of our model is due to optimization by removing unnecessary modules from conventional residual networks. The performance is further improved by expanding the model size while stabilizing the training procedure. We also propose a new multi-scale deep super-resolution system (MDSR) and training method, which can reconstruct high-resolution images for different upscaling factors in a single model. The proposed methods show superior performance over the state-of-the-art methods on benchmark datasets and prove their excellence by winning the NTIRE2017 super-resolution challenge.",4 "Content-based image retrieval based on late fusion of binary local descriptors. One of the challenges in content-based image retrieval (CBIR) is to reduce the semantic gaps between low-level features and high-level semantic concepts. In CBIR, images are represented in a feature space, and the performance of CBIR depends on the type of selected feature representation. Late fusion, also known as visual words integration, is applied to enhance the performance of image retrieval. Recent advances in image retrieval have diverted the focus of research towards the use of binary descriptors, as they are reported to be computationally efficient. In this paper, we aim to investigate the late fusion of the fast retina keypoint (FREAK) and the scale invariant feature transform (SIFT). FREAK was selected among the binary descriptors for this late fusion because it has shown good results in classification-based problems, while SIFT is robust to translation, scaling, rotation and small distortions. The late fusion of FREAK and SIFT integrates the performance of both feature descriptors for effective image retrieval. Experimental results and comparisons show that the proposed late fusion enhances the performance of image retrieval.",4 "Approximating continuous functions by ReLU nets of minimal width. This article concerns the expressive power of depth in deep feed-forward neural nets with ReLU activations.
Specifically, we answer the following question: for a fixed $d_{in}\geq 1,$ what is the minimal width $w$ so that neural nets with ReLU activations, input dimension $d_{in}$, hidden layer widths at most $w,$ and arbitrary depth can approximate any continuous, real-valued function of $d_{in}$ variables arbitrarily well? It turns out that this minimal width is exactly equal to $d_{in}+1.$ That is, if all hidden layer widths are bounded by $d_{in}$, then even in the infinite depth limit, ReLU nets can only express a very limited class of functions, and, on the other hand, any continuous function on the $d_{in}$-dimensional unit cube can be approximated to arbitrary precision by ReLU nets in which all hidden layers have width exactly $d_{in}+1.$ Our construction in fact shows that any continuous function $f:[0,1]^{d_{in}}\to\mathbb{R}^{d_{out}}$ can be approximated by a net of width $d_{in}+d_{out}$. We obtain quantitative depth estimates for such an approximation in terms of the modulus of continuity of $f$.",19 "A modular analysis of adaptive (non-)convex optimization: optimism, composite objectives, and variational bounds. Recently, much work has been done on extending the scope of online learning and incremental stochastic optimization algorithms. In this paper we contribute to this effort in two ways: First, based on a new regret decomposition and a generalization of Bregman divergences, we provide a self-contained, modular analysis of the two workhorses of online learning: (general) adaptive versions of mirror descent (MD) and the follow-the-regularized-leader (FTRL) algorithms. The analysis is done with extra care so as not to introduce assumptions not needed in the proofs, and allows us to combine, in a straightforward way, different algorithmic ideas (e.g., adaptivity, optimism, implicit updates) and learning settings (e.g., strongly convex or composite objectives). This way we are able to reprove, extend and refine a large body of the literature, while keeping the proofs concise. The second contribution is a byproduct of this careful analysis: we present algorithms with improved variational bounds for smooth, composite objectives, including a new family of optimistic MD algorithms with only one projection step per round. Furthermore, we provide a simple extension of adaptive regret bounds to practically relevant non-convex problem settings with essentially no extra effort.",4 "A neural network approach to context-sensitive generation of conversational responses.
We present a novel response generation system that can be trained end to end on large quantities of unstructured Twitter conversations. A neural network architecture is used to address sparsity issues that arise when integrating contextual information into classic statistical models, allowing the system to take into account previous dialog utterances. Our dynamic-context generative models show consistent gains over both context-sensitive and non-context-sensitive machine translation and information retrieval baselines.",4 "Capacity and trainability in recurrent neural networks. Two potential bottlenecks on the expressiveness of recurrent neural networks (RNNs) are their ability to store information about the task in their parameters, and to store information about the input history in their units. We show experimentally that all common RNN architectures achieve nearly the same per-task and per-unit capacity bounds with careful training, for a variety of tasks and stacking depths. They can store an amount of task information which is linear in the number of parameters, and is approximately 5 bits per parameter. They can additionally store approximately one real number from their input history per hidden unit. We further find that for several tasks it is the per-task parameter capacity bound that determines performance. These results suggest that many previous results comparing RNN architectures are driven primarily by differences in training effectiveness, rather than differences in capacity. Supporting this observation, we compare training difficulty for several architectures, and show that vanilla RNNs are far more difficult to train, yet have slightly higher capacity. Finally, we propose two novel RNN architectures, one of which is easier to train than the LSTM or GRU for deeply stacked architectures.",19 "Simple pairs of points in digital spaces. We study topology-preserving transformations of digital spaces by contracting simple pairs of points. Transformations of digital spaces that preserve local and global topology play an important role in thinning, skeletonization and simplification of digital images. In the present paper, we introduce and study contractions of simple pairs of points based on the notions of a digital contractible space and contractible transformations of digital spaces. We show that the contraction of a simple pair of points preserves the local and global topology of a digital space. Relying on the obtained results, we study properties of digital manifolds.
In particular, we show that a digital n-manifold can be transformed to its compressed form with the minimal number of points by sequential contractions of simple pairs. Key words: graph, digital space, contraction, splitting, simple pair, homotopy, thinning",4 "Cross-media similarity evaluation for web image retrieval in the wild. In order to retrieve unlabeled images by textual queries, cross-media similarity computation is a key ingredient. Although novel methods are continuously introduced, little has been done to evaluate these methods together with large-scale query log analysis. Consequently, how far these methods have brought us in answering real-user queries is unclear. Given baseline methods that compute cross-media similarity using relatively simple text/image matching, how much progress the advanced models have made is also unclear. This paper takes a pragmatic approach to answering these two questions. Queries are automatically categorized according to the proposed query visualness measure, and later connected to the evaluation of multiple cross-media similarity models on three test sets. Such a connection reveals that the success of the state-of-the-art is mainly attributed to their good performance on visual-oriented queries, while these queries account for only a small part of real-user queries. To quantify the current progress, we propose a simple text2image method, representing a novel test query by a set of images selected from a large-scale query log. Consequently, computing the cross-media similarity between the test query and a given image boils down to comparing the visual similarity between the given image and the selected images. Image retrieval experiments on the challenging Clickture dataset show that the proposed text2image compares favorably to recent deep learning based alternatives.",4 "KonIQ-10k: towards an ecologically valid and large-scale IQA database. The main challenge in applying state-of-the-art deep learning methods to predict image quality in-the-wild is the relatively small size of existing quality scored datasets. The reason for the lack of larger datasets is the massive resources required in generating diverse and publishable content. We present a new systematic and scalable approach to create large-scale, authentic and diverse image datasets for image quality assessment (IQA).
we show how we built an iqa database, koniq-10k, consisting of 10,073 images, on which we performed very large scale crowdsourcing experiments in order to obtain reliable quality ratings from 1,467 crowd workers (1.2 million ratings). we argue for its ecological validity by analyzing the diversity of the dataset, comparing it to state-of-the-art iqa databases, and by checking the reliability of our user studies.",4 "momentum and stochastic momentum for stochastic gradient, newton, proximal point and subspace descent methods. in this paper we study several classes of stochastic optimization algorithms enriched with heavy ball momentum. among the methods studied are: stochastic gradient descent, stochastic newton, stochastic proximal point and stochastic dual subspace ascent. this is the first time momentum variants of several of these methods are studied. we choose to perform our analysis in a setting in which all of the above methods are equivalent. we prove global non-asymptotic linear convergence rates for all methods and various measures of success, including primal function values, primal iterates (in the l2 sense), and dual function values. we also show that the primal iterates converge at an accelerated linear rate in the l1 sense. this is the first time a linear rate is shown for the stochastic heavy ball method (i.e., stochastic gradient descent with momentum). under somewhat weaker conditions, we establish a sublinear convergence rate for cesaro averages of primal iterates. moreover, we propose a novel concept, which we call stochastic momentum, aimed at decreasing the cost of performing the momentum step. we prove linear convergence of several stochastic methods with stochastic momentum, and show that in some sparse data regimes and for sufficiently small momentum parameters, these methods enjoy better overall complexity than methods with deterministic momentum. finally, we perform extensive numerical testing on artificial and real datasets, including data coming from average consensus problems.",12 "generalized topic modeling. recently there has been significant activity in developing algorithms with provable guarantees for topic modeling. in standard topic models, a topic (such as sports, business, or politics) is viewed as a probability distribution $\vec a_i$ over words, and a document is generated by first selecting a mixture $\vec w$ over topics, and then generating words i.i.d. from the associated mixture $a_{\vec w}$.
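The heavy ball momentum update studied in the momentum abstract above can be sketched on a simple strongly convex quadratic; this is an illustrative deterministic sketch, not the paper's stochastic analysis, and the step size and momentum parameter are assumptions:

```python
import numpy as np


def heavy_ball(grad, x0, step=0.01, beta=0.9, iters=2000):
    """Gradient descent with heavy ball momentum:
    x_{k+1} = x_k - step * grad(x_k) + beta * (x_k - x_{k-1})."""
    x_prev, x = x0.copy(), x0.copy()
    for _ in range(iters):
        x_next = x - step * grad(x) + beta * (x - x_prev)
        x_prev, x = x, x_next
    return x


# strongly convex quadratic 0.5 * ||A x - b||^2 with a known minimizer
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)
grad = lambda x: A.T @ (A @ x - b)
x_star, *_ = np.linalg.lstsq(A, b, rcond=None)
x_hb = heavy_ball(grad, np.zeros(5))
print(np.linalg.norm(x_hb - x_star))
```

For small enough step sizes the iterates converge linearly to the least-squares minimizer, which is the deterministic analogue of the rates discussed in the abstract.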
given a large collection of such documents, the goal is to recover the topic vectors and then to correctly classify new documents according to their topic mixture. in this work we consider a broad generalization of this framework in which words are no longer assumed to be drawn i.i.d. and instead a topic is a complex distribution over sequences of paragraphs. since one could not hope to even represent such a distribution in general (even if paragraphs are given using a natural feature representation), we aim instead to directly learn a document classifier. that is, we aim to learn a predictor that, given a new document, accurately predicts its topic mixture, without learning the distributions explicitly. we present several natural conditions under which one can do this efficiently and discuss issues of noise tolerance and sample complexity in this model. more generally, our model can be viewed as a generalization of the multi-view or co-training setting in machine learning.",4 "an oriented straight line segment algebra: qualitative spatial reasoning about oriented objects. nearly 15 years ago, a set of qualitative spatial relations between oriented straight line segments (dipoles) was suggested by schlieder. this work received substantial interest amongst the qualitative spatial reasoning community. however, it turned out to be difficult to establish a sound constraint calculus based on these relations. in this paper, we present the results of a new investigation into dipole constraint calculi which uses algebraic methods to derive sound results on the composition of relations and other properties of dipole calculi. our results are based on a condensed semantics of the dipole relations. in contrast to the points normally used, dipoles are extended and have an intrinsic direction. both features are important properties of natural objects. this allows for a straightforward representation of prototypical reasoning tasks for spatial agents. as an example, we show how to generate survey knowledge from local observations in a street network. the example illustrates the fast constraint-based reasoning capabilities of the dipole calculus. we integrate our results into two reasoning tools which are publicly available.",4 "a memory enriched big bang big crunch optimization algorithm for data clustering. cluster analysis plays an important role in the decision making process of many knowledge-based systems.
there exist a wide variety of different approaches to clustering applications, including heuristic techniques, probabilistic models, and traditional hierarchical algorithms. in this paper, a novel heuristic approach based on the big bang-big crunch algorithm is proposed for clustering problems. the proposed method not only takes advantage of its heuristic nature to alleviate typical problems of clustering algorithms such as k-means, but also benefits from a memory based scheme as compared to similar heuristic techniques. furthermore, the performance of the proposed algorithm is investigated based on several benchmark test functions as well as well-known datasets. the experimental results show the significant superiority of the proposed method over similar algorithms.",4 "causal decision trees. uncovering causal relationships in data is a major objective of data analytics. causal relationships are normally discovered with designed experiments, e.g. randomised controlled trials, which, however, are expensive or infeasible to conduct in many cases. causal relationships can also be found using well designed observational studies, but they require domain experts' knowledge and the process is normally time consuming. hence there is a need for scalable and automated methods for causal relationship exploration in data. classification methods are fast and could be practical substitutes for finding causal signals in data. however, classification methods are not designed for causal discovery, and a classification method may find false causal signals and miss true ones. in this paper, we develop a causal decision tree whose nodes have causal interpretations. our method follows a well established causal inference framework and makes use of a classic statistical test. the method is practical for finding causal signals in large data sets.",4 "a one class classifier based framework using svdd: application to an imbalanced geological dataset. the evaluation of a hydrocarbon reservoir requires classification of its petrophysical properties from the available dataset. however, characterization of reservoir attributes is difficult due to the nonlinear and heterogeneous nature of subsurface physical properties.
in this context, the present study proposes a generalized one class classification framework based on support vector data description (svdd) to classify a reservoir characteristic, water saturation, into two classes (class high and class low) from four logs, namely gamma ray, neutron porosity, bulk density, and p-sonic, using an imbalanced dataset. a comparison is carried out among the proposed framework and different supervised classification algorithms in terms of g-metric means and execution time. experimental results show that the proposed framework outperforms the other classifiers in terms of these performance evaluators. it is envisaged that the classification analysis performed in this study will be useful in reservoir modeling.",4 "recurrent neural network postfilters for statistical parametric speech synthesis. in the last two years, numerous papers have looked into using deep neural networks to replace the acoustic model of traditional statistical parametric speech synthesis. however, far less attention has been paid to approaches like dnn-based postfiltering, where dnns work in conjunction with traditional acoustic models. in this paper, we investigate the use of recurrent neural networks as a potential postfilter for synthesis. we explore the possibility of replacing existing postfilters, as well as highlight the ease with which arbitrary new features can be added as input to the postfilter. we also tried a novel approach of jointly training the classification and regression tree and the postfilter, rather than the traditional approach of training them independently.",4 "direct learning to rank and rerank. learning-to-rank techniques have proven to be extremely useful for prioritization problems, where we rank items in order of their estimated probabilities and dedicate our limited resources to the top-ranked items. this work exposes a serious problem with the state of learning-to-rank algorithms, which is that they are based on convex proxies that lead to poor approximations. we then discuss the possibility of ""exact"" reranking algorithms based on mathematical programming. we prove that a relaxed version of the ""exact"" problem has the same optimal solution, and provide an empirical analysis.",19 "linearized kernel dictionary learning. in this paper we present a new approach of incorporating kernels into dictionary learning.
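The svdd abstract above compares classifiers by g-metric means. A minimal sketch of that metric, under the usual definition as the geometric mean of sensitivity and specificity for a binary classifier:

```python
import math


def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity (recall on the positive class) and
    specificity (recall on the negative class); robust to imbalance."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(sensitivity * specificity)


# imbalanced toy set: 8 negatives, 2 positives
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 1, 0, 1, 0]
print(g_mean(y_true, y_pred))
```

Unlike plain accuracy, the g-mean collapses to zero if either class is entirely misclassified, which is why it is a common evaluator for imbalanced data like the geological dataset above.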
the kernel k-svd algorithm (kksvd), which has been introduced recently, shows an improvement in classification performance relative to its linear counterpart k-svd. however, this algorithm requires the storage and handling of a very large kernel matrix, which leads to high computational cost, while also limiting its use to setups with a small number of training examples. we address these problems by combining two ideas: first we approximate the kernel matrix using a cleverly sampled subset of its columns using the nystr\""{o}m method; secondly, as we wish to avoid using this matrix altogether, we decompose it by svd to form new ""virtual samples,"" on which any linear dictionary learning can be employed. our method, termed ""linearized kernel dictionary learning"" (lkdl), can be seamlessly applied as a pre-processing stage on top of any efficient off-the-shelf dictionary learning scheme, effectively ""kernelizing"" it. we demonstrate the effectiveness of our method on several tasks of both supervised and unsupervised classification, and show the efficiency of the proposed scheme, its easy integration and its performance boosting properties.",4 "precision and recall for range-based anomaly detection. classical anomaly detection is principally concerned with point-based anomalies, anomalies that occur at a single data point. in this paper, we present a new mathematical model to express range-based anomalies, anomalies that occur over a range (or period) of time.",4 "application-oriented terminology evaluation: the case of back-of-the-book indexes. this paper addresses the problem of computational terminology evaluation not per se but in a specific application context. the paper describes the evaluation procedure used to assess the validity of our overall indexing approach and the quality of the inddoc indexing tool. even if user-oriented extended evaluation is irreplaceable, we argue that early evaluations are possible and useful for development guidance.",4 "weather perception: joint data association, tracking, and classification for autonomous ground vehicles. a novel probabilistic perception algorithm is presented as a real-time joint solution to data association, object tracking, and object classification for an autonomous ground vehicle in all-weather conditions.
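The column-sampling step described in the linearized kernel dictionary learning abstract above can be sketched with a plain Nyström approximation of an rbf kernel matrix; the data, landmark count, and kernel width are illustrative assumptions:

```python
import numpy as np


def nystrom_features(X, m, gamma, rng):
    """Nystrom approximation of an RBF kernel matrix K ~ Z @ Z.T,
    built from a random subset of m columns (landmark points)."""
    idx = rng.choice(len(X), size=m, replace=False)
    L = X[idx]

    def rbf(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)

    C = rbf(X, L)  # n x m block of sampled kernel columns
    W = rbf(L, L)  # m x m block on the landmarks
    # Z = C @ W^{-1/2}, via eigendecomposition of the landmark block
    vals, vecs = np.linalg.eigh(W)
    inv_sqrt = vecs @ np.diag(1.0 / np.sqrt(np.clip(vals, 1e-12, None))) @ vecs.T
    return C @ inv_sqrt


rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3))
Z = nystrom_features(X, m=50, gamma=0.5, rng=rng)
K = np.exp(-0.5 * ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
err = np.linalg.norm(K - Z @ Z.T) / np.linalg.norm(K)
print(err)
```

The rows of `Z` play the role of the "virtual samples" mentioned above: a linear method applied to them approximates working with the full kernel matrix without ever storing it.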
the presented algorithm extends a rao-blackwellized particle filter, originally built with a particle filter for data association and a kalman filter for multi-object tracking (miller et al. 2011a), to also include multiple model tracking and classification. additionally, a state-of-the-art vision detection algorithm that includes heading information for autonomous ground vehicle (agv) applications was implemented. cornell's agv from the darpa urban challenge was upgraded and used to experimentally examine if and how state-of-the-art vision algorithms can complement or replace lidar and radar sensors. sensor and algorithm performance in adverse weather and lighting conditions is tested. experimental evaluation demonstrates robust all-weather data association, tracking, and classification, where camera, lidar, and radar sensors complement each other inside the joint probabilistic perception algorithm.",4 "redundancy in logic i: cnf propositional formulae. a knowledge base is redundant if it contains parts that can be inferred from the rest of it. we study the problem of checking whether a cnf formula (a set of clauses) is redundant, that is, whether it contains clauses that can be derived from the other ones. any cnf formula can be made irredundant by deleting some of its clauses: what results is an irredundant equivalent subset (i.e.s.). we study the complexity of some related problems: verification, checking the existence of an i.e.s. with a given size, checking the necessary and possible presence of clauses in i.e.s.'s, and uniqueness. we also consider the problem of redundancy under different definitions of equivalence.",4 "a novel tuneable method for skin detection based on a hybrid color space and color statistical features. skin detection is one of the most important and primary stages in image processing applications such as face detection and human tracking. so far, many approaches have been proposed for this case. most of these methods have tried to find the best match intensity distribution of skin pixels based on popular color spaces such as rgb, cmyk or ycbcr. the results show that these methods cannot provide an accurate approach for every kind of skin. in this paper, an approach is proposed to solve this problem using a statistical features technique. the approach includes two stages. in the first one, pure skin statistical features are extracted, and in the second stage, skin pixels are detected using the hsv and ycbcr color spaces.
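A minimal sketch of the second stage described in the skin detection abstract above, classifying pixels by thresholds in hsv space; the threshold values here are illustrative assumptions, not the paper's tuned ranges:

```python
import colorsys


def is_skin_hsv(r, g, b):
    """Classify an RGB pixel (channels in 0..255) as skin using
    illustrative HSV thresholds: low hue (reddish tones), moderate
    saturation, and sufficient brightness."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return h <= 50 / 360.0 and 0.15 <= s <= 0.75 and v >= 0.35


print(is_skin_hsv(220, 170, 140))  # a typical light skin tone
print(is_skin_hsv(40, 90, 200))    # a blue pixel
```

A real detector would combine such a rule with a corresponding ycbcr-range test and tune both ranges from the statistical features extracted in the first stage.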
in the results part, the proposed approach is applied to the fei database and the accuracy rate reached 99.25 ± 0.2%. the proposed method was also applied to a complex background database and the accuracy rate obtained was 95.40 ± 0.31%. that the proposed approach can be used for all kinds of skin by using its training stage is among its main advantages. low noise sensitivity and low computational complexity are other advantages.",4 "a universal variance reduction-based catalyst for nonconvex low-rank matrix recovery. we propose a generic framework based on a new stochastic variance-reduced gradient descent algorithm for accelerating nonconvex low-rank matrix recovery. starting from an appropriate initial estimator, our proposed algorithm performs projected gradient descent based on a novel semi-stochastic gradient specifically designed for low-rank matrix recovery. based upon mild restricted strong convexity and smoothness conditions, we derive a projected notion of the restricted lipschitz continuous gradient property, and prove that our algorithm enjoys a linear convergence rate to the unknown low-rank matrix with an improved computational complexity. moreover, our algorithm can be employed for both noiseless and noisy observations, where the optimal sample complexity and the minimax optimal statistical rate can be attained respectively. we illustrate the superiority of our generic framework through several specific examples, both theoretically and experimentally.",19 "informed heuristics for guiding stem-and-cycle ejection chains. the state of the art in local search for the traveling salesman problem is dominated by ejection chain methods utilising the stem-and-cycle reference structure. though effective, these algorithms employ little information in their successor selection strategy, typically seeking only to minimise the cost of the move. we propose an alternative approach inspired by the ai literature and show that an admissible heuristic can be used to guide successor selection. we undertake an empirical analysis and demonstrate that this technique often produces better results than less informed strategies, albeit at the cost of running in higher polynomial time.",4 finding approximate local minima faster than gradient descent.
we design a non-convex second-order optimization algorithm that is guaranteed to return an approximate local minimum in time which scales linearly in the underlying dimension and the number of training examples. the time complexity of our algorithm to find an approximate local minimum is even faster than that of gradient descent to find a critical point. our algorithm applies to a general class of optimization problems including the training of a neural network and other non-convex objectives arising in machine learning.,12 "projection onto the probability simplex: an efficient algorithm with a simple proof, and an application. we provide an elementary proof of a simple, efficient algorithm for computing the euclidean projection of a point onto the probability simplex. we also show an application to laplacian k-modes clustering.",4 "real-time web scale event summarization using sequential decision making. we present a system based on sequential decision making for the online summarization of massive document streams, such as those found on the web. given an event of interest (e.g. ""boston marathon bombing""), our system is able to filter the stream for relevance and produce a series of short text updates describing the event as it unfolds over time. unlike previous work, our approach is able to jointly model the relevance, comprehensiveness, novelty, and timeliness required by time-sensitive queries. we demonstrate a 28.3% improvement in summary f1 and a 43.8% improvement in time-sensitive f1 metrics.",4 "automated identification of trampoline skills using computer vision extracted pose estimation. a novel method to identify trampoline skills using a single video camera is proposed herein. conventional computer vision techniques are used for the identification, estimation, and tracking of the gymnast's body in a video recording of the routine. for each frame, an open source convolutional neural network is used to estimate the pose of the athlete's body. body orientation and joint angle estimates are extracted from the pose estimates. the trajectories of these angle estimates over time are compared with labelled reference skills. a nearest neighbour classifier utilising a mean squared error distance metric is used to identify the skill performed. a dataset containing 714 skill examples, with 20 distinct skills performed by adult male and female gymnasts, was recorded and used for evaluation of the system.
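The simplex-projection abstract above concerns a classic sort-based algorithm; a minimal sketch of that projection (the standard algorithm, not necessarily the paper's exact presentation):

```python
import numpy as np


def project_to_simplex(v):
    """Euclidean projection of v onto the probability simplex
    {x : x >= 0, sum(x) = 1}, via the classic sort-and-threshold rule."""
    u = np.sort(v)[::-1]               # sort coordinates descending
    css = np.cumsum(u) - 1.0
    # largest index rho with u_rho > (cumsum_rho - 1) / rho (1-based)
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > css)[0][-1]
    theta = css[rho] / (rho + 1.0)     # the shift that makes x sum to 1
    return np.maximum(v - theta, 0.0)


p = project_to_simplex(np.array([0.5, 1.2, -0.3]))
print(p, p.sum())
```

The cost is dominated by the sort, so the projection runs in O(n log n); this is the building block that makes methods like laplacian k-modes practical.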
the system was found to achieve a skill identification accuracy of 80.7% on this dataset.",4 "clustering with side information: a probabilistic model and a deterministic algorithm. in this paper, we propose a model-based clustering method (tvclust) that robustly incorporates noisy side information as soft constraints and aims to seek a consensus between the side information and the observed data. our method is based on a nonparametric bayesian hierarchical model that combines a probabilistic model for the data instances with one for the side information. an efficient gibbs sampling algorithm is proposed for posterior inference. using the small-variance asymptotics of our probabilistic model, we then derive a new deterministic clustering algorithm (rdp-means). it can be viewed as an extension of k-means that allows the inclusion of side information and has the additional property that the number of clusters need not be specified a priori. empirical studies have been carried out to compare our work with many constrained clustering algorithms from the literature on a variety of data sets and under a variety of conditions, such as using noisy side information and erroneous k values. the results of our experiments show strong results for our probabilistic and deterministic approaches under these conditions when compared with other algorithms in the literature.",19 an algorithm for missing values imputation in categorical data with the use of association rules. this paper presents an algorithm for missing values imputation in categorical data. the algorithm is based on using association rules and is presented in three variants. experimental comparison shows better accuracy of missing values imputation using this algorithm than imputation using the most common attribute value.,4 "automatic recognition of mammal genera on camera-trap images using multi-layer robust principal component analysis and mixture neural networks. the segmentation and classification of animals from camera-trap images is, due to the conditions under which the images are taken, a difficult task. this work presents a method for classifying and segmenting mammal genera from camera-trap images. our method uses multi-layer robust principal component analysis (rpca) for segmenting, convolutional neural networks (cnns) for extracting features, the least absolute shrinkage and selection operator (lasso) for selecting features, and artificial neural networks (anns) or support vector machines (svm) for classifying the mammal genera present in a colombian forest.
we evaluated our method with camera-trap images from the alexander von humboldt biological resources research institute. we obtained an accuracy of 92.65% classifying 8 mammal genera and a false positive (fp) class, using automatically segmented images. on the other hand, we reached 90.32% accuracy classifying 10 mammal genera, using ground-truth images only. unlike almost all previous works, we confront both animal segmentation and genera classification in camera-trap recognition. our method shows a new approach toward fully-automatic detection of animals in camera-trap images.",4 "a visual-hint boundary to segment algorithm for image segmentation. image segmentation is an active research topic in the image analysis area. currently, most image segmentation algorithms are designed based on the idea that images are partitioned into a set of regions preserving homogeneous intra-regions and inhomogeneous inter-regions. however, human visual intuition does not always follow this pattern. a new image segmentation method named visual-hint boundary to segment (vhbs) is introduced, which is more consistent with human perceptions. vhbs abides by two visual hint rules based on human perceptions: (i) global scale boundaries tend to be the real boundaries of objects; (ii) two adjacent regions with quite different colors or textures tend to result in real boundaries between them. it is demonstrated by experiments that, compared with traditional image segmentation methods, vhbs has better performance and also preserves higher computational efficiency.",4 "hyperparameter optimization and boosting for classifying facial expressions: how good can a ""null"" model be?. one of the goals of the icml workshop on representation learning is to establish benchmark scores for a new data set of labeled facial expressions. this paper presents the performance of a ""null"" model consisting of convolutions with random weights, pca, pooling, normalization, and a linear readout. our approach focused on hyperparameter optimization rather than novel model components. on the facial expression recognition challenge held by the kaggle website, our hyperparameter optimization approach achieved a score of 60% accuracy on the test data. this paper also introduces a new ensemble construction variant that combines hyperparameter optimization with the construction of ensembles.
this algorithm constructed an ensemble of four models that scored 65.5% accuracy. these scores rank 12th and 5th respectively among the 56 challenge participants. it is worth noting that our approach was developed prior to the release of the data set and applied without modification; the strong competition performance suggests that the tpe hyperparameter optimization algorithm and the domain expertise encoded in our null model generalize to new image classification data sets.",4 "vector space model as a cognitive space for text classification. in this era of digitization, knowing a user's sociolect aspects has become an essential feature for building user specific recommendation systems. these sociolect aspects can be found by mining the user's language shared in the form of text on social media and in reviews. this paper describes an experiment performed for the pan author profiling 2017 shared task. the objective of the task is to find the sociolect aspects of users from their tweets. the sociolect aspects considered in this experiment are the user's gender and native language information. the user's tweets, written in a language different from their native language, are represented as a document-term matrix with a document frequency constraint. classification is done using a support vector machine, taking gender and native language as target classes. the experiment attains an average accuracy of 73.42% in gender prediction and 76.26% in the native language identification task.",4 "adversarial extreme multi-label classification. the goal of extreme multi-label classification is to learn a classifier which can assign a small subset of relevant labels to an instance from an extremely large set of target labels. datasets in extreme classification exhibit a long tail of labels which have a small number of positive training instances. in this work, we pose the learning task in extreme classification with a large number of tail-labels as learning in the presence of adversarial perturbations. this view motivates a robust optimization framework and an equivalence to a corresponding regularized objective. under the proposed robustness framework, we demonstrate the efficacy of hamming loss for tail-label detection in extreme classification.
the equivalent regularized objective, in combination with proximal gradient based optimization, performs better than state-of-the-art methods on propensity scored versions of precision@k and ndcg@k (up to 20% relative improvement over pfastrexml - a leading tree-based approach - and 60% relative improvement over sleec - a leading label-embedding approach). furthermore, we also highlight the sub-optimality of a sparse solver in a widely used package for large-scale linear classification, which is interesting in its own right. we also investigate the spectral properties of label graphs to provide novel insights towards understanding the conditions governing the performance of the hamming loss based one-vs-rest scheme vis-\`a-vis label embedding methods.",19 "a paradigm shift: detecting human rights violations through web images. the growing presence of devices carrying digital cameras, such as mobile phones and tablets, combined with ever improving internet networks, has enabled ordinary citizens, victims of human rights abuse, and participants in armed conflicts, protests, and disaster situations to capture and share, via social media networks, images and videos of specific events. this paper discusses the potential of images in the human rights context, including the opportunities and challenges they present. this study demonstrates that real-world images have the capacity to contribute complementary data to operational human rights monitoring efforts when combined with novel computer vision approaches. the analysis concludes by arguing that if images are to be used effectively to detect and identify human rights violations by rights advocates, greater attention to gathering task-specific visual concepts from large-scale web images is required.",4 "dendritic error backpropagation in deep cortical microcircuits. animal behaviour depends on learning to associate sensory stimuli with the desired motor command. understanding how the brain orchestrates the necessary synaptic modifications across different brain areas has remained a longstanding puzzle. here, we introduce a multi-area neuronal network model in which synaptic plasticity continuously adapts the network towards a global desired output.
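The extreme classification abstract above reports gains on precision@k and ndcg@k; a minimal sketch of these two ranking metrics under their standard definitions with binary relevance (the scores and label set below are illustrative):

```python
import math


def precision_at_k(scores, relevant, k):
    """Fraction of the k highest-scored labels that are relevant."""
    top = sorted(range(len(scores)), key=lambda i: -scores[i])[:k]
    return sum(1 for i in top if i in relevant) / k


def ndcg_at_k(scores, relevant, k):
    """DCG of the top-k labels with binary gains, normalized by the
    ideal DCG achievable for this number of relevant labels."""
    top = sorted(range(len(scores)), key=lambda i: -scores[i])[:k]
    dcg = sum(1.0 / math.log2(r + 2) for r, i in enumerate(top) if i in relevant)
    ideal = sum(1.0 / math.log2(r + 2) for r in range(min(k, len(relevant))))
    return dcg / ideal


scores = [0.9, 0.1, 0.8, 0.4]   # predicted scores for 4 labels
relevant = {0, 3}               # ground-truth label set
print(precision_at_k(scores, relevant, 3), ndcg_at_k(scores, relevant, 3))
```

The propensity-scored variants mentioned in the abstract additionally reweight each relevant label by an inverse propensity term so that rare tail-labels count more.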
in this model, synaptic learning is driven by a local dendritic prediction error that arises from a failure to predict the top-down input given the bottom-up activities. such errors occur at the apical dendrites of pyramidal neurons, where long-range excitatory feedback and local inhibitory predictions are integrated. when local inhibition fails to match the excitatory feedback, an error occurs which triggers plasticity at the bottom-up synapses on the basal dendrites of pyramidal neurons. we demonstrate the learning capabilities of the model in a number of tasks and show that it approximates the classical error backpropagation algorithm. finally, complementing this cortical circuit with a disinhibitory mechanism enables attention-like stimulus denoising and generation. our framework makes several experimental predictions on the function of dendritic integration and cortical microcircuits, is consistent with recent observations of cross-area learning, and suggests a biological implementation of deep learning.",16 "content selection in data-to-text systems: a survey. data-to-text systems are powerful in generating reports from data automatically, and thus they simplify the presentation of complex data. rather than presenting data using visualisation techniques, data-to-text systems use natural (human) language, which is the most common way of human-human communication. in addition, data-to-text systems can adapt their output content to users' preferences, background or interests, and therefore they are pleasant for users to interact with. content selection is an important part of every data-to-text system, because it is the module that determines which of the available information should be conveyed to the user. this survey initially introduces the field of data-to-text generation, describes the general data-to-text system architecture, and then reviews the state-of-the-art content selection methods. finally, it provides recommendations for choosing an approach and discusses opportunities for future research.",4 "a logic for reasoning about evidence. we introduce a logic for reasoning about evidence that essentially views evidence as a function from prior beliefs (before making an observation) to posterior beliefs (after making the observation). we provide a sound and complete axiomatization for the logic, and consider the complexity of the decision problem.
although the reasoning in the logic is mainly propositional, we allow variables representing numbers and quantification over them. this expressive power seems necessary to capture important properties of evidence",4 "a methodology to analyze the accuracy of 3d objects reconstructed with collaborative robot based monocular lsd-slam. slam systems are mainly applied for robot navigation, while research on their feasibility for motion planning with slam, for tasks like bin-picking, is scarce. accurate 3d reconstruction of objects and environments is important for planning motion and computing an optimal gripper pose to grasp objects. in this work, we propose methods to analyze the accuracy of a 3d environment reconstructed using an lsd-slam system with a monocular camera mounted onto the gripper of a collaborative robot. we discuss and propose a solution to the pose space conversion problem. finally, we present several criteria to analyze the 3d reconstruction accuracy. these could be used as guidelines to improve the accuracy of 3d reconstructions with monocular lsd-slam and other slam based solutions.",4 "the absent-minded driver problem redux. this paper reconsiders the problem of the absent-minded driver who must choose between alternatives with different payoffs, with imperfect recall and varying degrees of knowledge of the system. the classical absent-minded driver problem represents a case of limited information and has bearing on the general areas of communication and learning, social choice, mechanism design, auctions, theories of knowledge, belief, and rational agency. within the framework of extensive games, the problem has applications to many artificial intelligence scenarios. it is obvious that the performance of the agent improves as the information available increases. it is shown that a non-uniform assignment strategy for successive choices does better than a fixed probability strategy. we consider both classical and quantum approaches to the problem. we argue that the superior performance of quantum decisions with access to entanglement cannot be fairly compared to that of a classical algorithm. if the cognitive systems of agents are taken to have access to quantum resources, or a quantum mechanical basis, that can be leveraged into superior performance.",4 "hypothesis testing using pairwise distances and associated kernels (with appendix).
we provide a unifying framework linking two classes of statistics used in two-sample and independence testing: on the one hand, the energy distances and distance covariances from the statistics literature; on the other, distances between embeddings of distributions into reproducing kernel hilbert spaces (rkhs), as established in machine learning. the equivalence holds when energy distances are computed with semimetrics of negative type, in which case a kernel may be defined such that the rkhs distance between distributions corresponds exactly to the energy distance. we determine the class of probability distributions for which kernels induced by semimetrics are characteristic (that is, for which embeddings of the distributions into an rkhs are injective). finally, we investigate the performance of this family of kernels in two-sample and independence tests: we show in particular that the energy distance most commonly employed in statistics is just one member of a parametric family of kernels, and that other choices from this family can yield more powerful tests.",4 "deep cnn based feature extractor for text-prompted speaker recognition. deep learning is still not a very common tool in the speaker verification field. we study deep convolutional neural network performance in the text-prompted speaker verification task. the prompted passphrase is segmented into word states - i.e. digits - to test each digit utterance separately. we train a single high-level feature extractor for all states and use the cosine similarity metric for scoring. the key feature of our network is the max-feature-map activation function, which acts as an embedded feature selector. by using a multitask learning scheme to train the high-level feature extractor, we were able to surpass the classic baseline systems in terms of quality and achieved impressive results for such a novice approach, getting 2.85% eer on the rsr2015 evaluation set. fusion of the proposed and baseline systems improves this result.",6 "interactive graphics for visually diagnosing forest classifiers in r. this paper describes structuring data and constructing plots to explore forest classification models interactively. a forest classifier is an example of an ensemble, produced by bagging multiple trees. the process of bagging and combining results from multiple trees produces numerous diagnostics which, with interactive graphics, can provide a lot of insight into class structure in high dimensions.
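The kernel/energy-distance equivalence stated in the hypothesis testing abstract above can be checked numerically: with the kernel induced by the euclidean semimetric, k(a, b) = (|a| + |b| - |a-b|) / 2, the energy distance equals twice the squared mmd. A minimal 1-d sketch using v-statistics (all pairs), where the identity holds exactly:

```python
import numpy as np


def energy_distance(x, y):
    """V-statistic estimate of the energy distance between samples x, y:
    2 E|X-Y| - E|X-X'| - E|Y-Y'| (1-d, euclidean)."""
    dxy = np.abs(x[:, None] - y[None, :]).mean()
    dxx = np.abs(x[:, None] - x[None, :]).mean()
    dyy = np.abs(y[:, None] - y[None, :]).mean()
    return 2 * dxy - dxx - dyy


def mmd_sq(x, y):
    """V-statistic MMD^2 with the distance-induced kernel
    k(a, b) = (|a| + |b| - |a-b|) / 2."""
    k = lambda a, b: 0.5 * (np.abs(a[:, None]) + np.abs(b[None, :])
                            - np.abs(a[:, None] - b[None, :]))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()


rng = np.random.default_rng(0)
x = rng.standard_normal(50)
y = rng.standard_normal(60) + 1.0
print(energy_distance(x, y), 2 * mmd_sq(x, y))  # equal, per the equivalence
```

Expanding the three kernel terms and cancelling the E|X| and E|Y| contributions recovers the energy distance algebraically, which is why the two printed numbers agree to machine precision.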
various aspects are explored in this paper, to assess model complexity, individual model contributions, variable importance and dimension reduction, and uncertainty in prediction associated with individual observations. the ideas are applied to the random forest algorithm and a projection pursuit forest, but could be broadly applied to other bagged ensembles. the interactive graphics are built in r, using the ggplot2, plotly, and shiny packages.",19 "deep adversarial neural decoding. here, we present a novel approach to solve the problem of reconstructing perceived stimuli from brain responses by combining probabilistic inference with deep learning. our approach first inverts the linear transformation from latent features to brain responses with maximum a posteriori estimation, and then inverts the nonlinear transformation from perceived stimuli to latent features with adversarial training of convolutional neural networks. we test our approach with a functional magnetic resonance imaging experiment and show that it can generate state-of-the-art reconstructions of perceived faces from brain activations.",16 "rows vs columns for linear systems of equations - randomized kaczmarz or coordinate descent?. this paper is about randomized iterative algorithms for solving a linear system of equations $x \beta = y$ in different settings. recent interest in the topic was reignited when strohmer and vershynin (2009) proved the linear convergence rate of a randomized kaczmarz (rk) algorithm that works on the rows of $x$ (data points). following that, leventhal and lewis (2010) proved the linear convergence of a randomized coordinate descent (rcd) algorithm that works on the columns of $x$ (features). the aim of this paper is to simplify our understanding of these two algorithms, establish direct relationships between them (though rk is often compared to stochastic gradient descent), and examine the algorithmic commonalities and tradeoffs involved in working with rows or columns. we also discuss kernel ridge regression and present a kaczmarz-style algorithm that works on data points and has the advantage of solving the problem without ever storing or forming the gram matrix, one of the recognized problems encountered when scaling kernelized methods.",12 "examining cooperation in visual dialog models.
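The row-action method discussed in the rows-vs-columns abstract above can be sketched directly: randomized kaczmarz projects the iterate onto the hyperplane of one sampled row per step. A minimal sketch with strohmer-vershynin row sampling on a small consistent system (sizes and iteration count are illustrative):

```python
import numpy as np


def randomized_kaczmarz(X, y, iters=5000, seed=0):
    """Randomized Kaczmarz for a consistent system X beta = y: at each
    step, project the iterate onto the hyperplane {b : X[i] @ b = y[i]}
    of a row sampled with probability proportional to its squared norm."""
    rng = np.random.default_rng(seed)
    probs = (X ** 2).sum(axis=1)
    probs = probs / probs.sum()
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        i = rng.choice(len(X), p=probs)
        beta += (y[i] - X[i] @ beta) / (X[i] @ X[i]) * X[i]
    return beta


rng = np.random.default_rng(1)
X = rng.standard_normal((30, 5))
beta_true = rng.standard_normal(5)
y = X @ beta_true            # consistent system, so RK converges to beta_true
beta = randomized_kaczmarz(X, y)
print(np.linalg.norm(beta - beta_true))
```

The column-based counterpart (rcd) has the same per-step cost but updates one coordinate of beta using a column of X; the abstract's point is that the two are close relatives with different row/column tradeoffs.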
in this work we propose a blackbox intervention method for visual dialog models, with the aim of assessing the contribution of individual linguistic or visual components. concretely, we conduct structured or randomized interventions that aim to impair an individual component of the model, and observe changes in task performance. we reproduce a state-of-the-art visual dialog model and demonstrate that our methodology yields surprising insights, namely that both dialog and image information have minimal contributions to task performance. the intervention method presented here can be applied as a sanity check for the strength and robustness of each component in visual dialog systems.",4 "learning optimal forecast aggregation in partial evidence environments. we consider the forecast aggregation problem in repeated settings, where the forecasts are done on a binary event. at each period multiple experts provide forecasts about the event. the goal of the aggregator is to aggregate those forecasts into a subjective accurate forecast. we assume that experts are bayesian; namely they share a common prior, each expert is exposed to some evidence, and each expert applies bayes rule to deduce his forecast. the aggregator is ignorant with respect to the information structure (i.e., the distribution over evidence) according to which experts make their predictions. the aggregator observes the experts' forecasts only. at the end of each period the actual state is realized. we focus on the question of whether the aggregator can learn to aggregate optimally the forecasts of the experts, where the optimal aggregation is the bayesian aggregation that takes into account all the information (evidence) in the system. we consider the class of partial evidence information structures, where each expert is exposed to a different subset of conditionally independent signals. our main results are positive; we show that optimal aggregation can be learned in polynomial time in a quite wide range of instances of partial evidence environments. we provide a tight characterization of the instances where learning is possible and impossible.",4 "solving cooperative reliability games. cooperative games model the allocation of profit from joint actions, following considerations such as stability and fairness. we propose the reliability extension of such games, where agents may fail to participate in the game. in the reliability extension, each agent only ""survives"" with a certain probability, and a coalition's value is the probability that its surviving members would be a winning coalition in the base game.
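The reliability extension defined in the abstract above, where a coalition's value is the expectation of the base game's value over which members survive, can be sketched by enumerating survivor subsets; the 3-agent majority game and survival probabilities below are illustrative:

```python
from itertools import combinations


def reliability_value(coalition, base_value, survive_p):
    """Value of a coalition in the reliability extension: the expectation,
    over which members survive, of the base game's value of the survivors."""
    coalition = list(coalition)
    total = 0.0
    for r in range(len(coalition) + 1):
        for survivors in combinations(coalition, r):
            prob = 1.0
            for agent in coalition:
                prob *= survive_p[agent] if agent in survivors else 1 - survive_p[agent]
            total += prob * base_value(frozenset(survivors))
    return total


# base game: a simple majority game on 3 agents (winning iff >= 2 members)
base = lambda s: 1.0 if len(s) >= 2 else 0.0
p = {0: 0.9, 1: 0.9, 2: 0.5}
print(reliability_value({0, 1, 2}, base, p))
```

Enumeration is exponential in coalition size, which is why the abstract's results on approximating the shapley value and exploiting few agent types matter for larger games.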
we study prominent solution concepts in such games, showing how to approximate the shapley value and how to compute the core in games with few agent types. we also show that applying the reliability extension may stabilize the game, making the core non-empty even when the base game has an empty core.",4 "machine learning for drug overdose surveillance. we describe two recently proposed machine learning approaches for discovering emerging trends in fatal accidental drug overdoses. the gaussian process subset scan enables early detection of emerging patterns in spatio-temporal data, accounting for both the non-iid nature of the data and the fact that detecting subtle patterns requires integration of information across multiple spatial areas and multiple time steps. we apply this approach to 17 years of county-aggregated data for monthly opioid overdose deaths in the new york city metropolitan area, showing clear advantages in the utility of the discovered patterns as compared to typical anomaly detection approaches. to detect and characterize emerging overdose patterns that differentially affect a subpopulation of the data, including geographic, demographic, and behavioral patterns (e.g., combinations of drugs involved), we apply the multidimensional tensor scan to 8 years of case-level overdose data from allegheny county, pa. we discover previously unidentified overdose patterns which reveal unusual demographic clusters, show impacts of drug legislation, and demonstrate potential for early detection and targeted intervention. these approaches to early detection of overdose patterns can inform prevention and response efforts, as well as understanding of the effects of policy changes.",4 "modeling vagueness and uncertainty in data-to-text systems through fuzzy sets. vagueness and uncertainty management is counted among one of the challenges that remain unresolved in systems that generate texts from non-linguistic data, known as data-to-text systems. in the last decade, work on fuzzy linguistic summarization and description of data has raised interest in using fuzzy sets to model and manage the imprecision of human language in data-to-text systems. however, despite some research in this direction, there has been no actual clear discussion and justification of how fuzzy sets can contribute to data-to-text for modeling the vagueness and uncertainty of words and expressions. 
this paper intends to bridge this gap by answering the following questions: what does vagueness mean in fuzzy sets theory? what does vagueness mean in data-to-text contexts? in what ways can fuzzy sets theory contribute to improve data-to-text systems? what challenges do researchers from both disciplines need to address for a successful integration of fuzzy sets into data-to-text systems? in what cases should the use of fuzzy sets be avoided in d2t? for this, we review and discuss the state of the art of vagueness modeling in natural language generation and data-to-text, describe potential and actual usages of fuzzy sets in data-to-text contexts, and provide additional insights into the engineering of data-to-text systems that make use of fuzzy set-based techniques.",4 "user modelling for avoiding overfitting in interactive knowledge elicitation for prediction. in human-in-the-loop machine learning, the user provides information beyond that in the training data. many algorithms and user interfaces have been designed to optimize and facilitate this human--machine interaction; however, fewer studies have addressed the potential defects that such designs can cause. effective interaction often requires exposing the user to the training data or its statistics. the design of the system is then critical, as this can lead to double use of data and overfitting, if the user reinforces noisy patterns in the data. we propose a user modelling methodology, assuming simple rational behaviour, to correct this problem. we show, in a user study with 48 participants, that the method improves predictive performance in a sparse linear regression sentiment analysis task, where graded user knowledge on feature relevance is elicited. we believe that the key idea of inferring user knowledge with probabilistic user models has general applicability in guarding against overfitting and improving interactive machine learning.",4 "classification constrained dimensionality reduction. dimensionality reduction is a topic of recent interest. in this paper, we present the classification constrained dimensionality reduction (ccdr) algorithm to account for label information. the algorithm can account for multiple classes as well as the semi-supervised setting. we present out-of-sample expressions for both labeled and unlabeled data. for unlabeled data, we introduce a method of embedding a new point as preprocessing for a classifier. 
for labeled data, we introduce a method that improves the embedding during the training phase using the out-of-sample extension. we investigate classification performance using the ccdr algorithm on hyper-spectral satellite imagery data. we demonstrate the performance gain for both local and global classifiers and demonstrate a 10% improvement of the $k$-nearest neighbors algorithm performance. we present a connection between intrinsic dimension estimation and the optimal embedding dimension obtained using the ccdr algorithm.",19 learning elm network weights using linear discriminant analysis. we present an alternative to the pseudo-inverse method for determining the hidden to output weight values for extreme learning machines performing classification tasks. the method is based on linear discriminant analysis and provides bayes optimal single point estimates for the weight values.,4 "characterisation of (sub)sequential rational functions over a general class of monoids. in this technical report we describe a general class of monoids for which (sub)sequential rational functions can be characterised in terms of a congruence relation in the flavour of the myhill-nerode relation. the class of monoids we consider is described in terms of natural algebraic axioms, contains the free monoids, groups, and the tropical monoid, and is closed under cartesian products.",4 "bidirectional-convolutional lstm based spectral-spatial feature learning for hyperspectral image classification. this paper proposes a novel deep learning framework named the bidirectional-convolutional long short term memory (bi-clstm) network to automatically learn the spectral-spatial features of hyperspectral images (hsis). in the network, the issue of spectral feature extraction is considered as a sequence learning problem, and a recurrent connection operator across the spectral domain is used to address it. meanwhile, inspired by the widely used convolutional neural network (cnn), a convolution operator across the spatial domain is incorporated into the network to extract spatial features. besides, to sufficiently capture spectral information, a bidirectional recurrent connection is proposed. in the classification phase, the learned features are concatenated into a vector and fed to a softmax classifier via a fully-connected operator. 
to validate the effectiveness of the proposed bi-clstm framework, we compare it with several state-of-the-art methods, including the cnn framework, on three widely used hsis. the obtained results show that bi-clstm can improve the classification performance as compared to other methods.",4 "a flexible iterative framework for consensus clustering. a novel framework for consensus clustering is presented which has the ability to determine the number of clusters in the final solution and to use multiple algorithms. a consensus similarity matrix is formed from an ensemble using multiple algorithms and several values for k. a variety of dimension reduction techniques and clustering algorithms are considered in the analysis. for noisy or high-dimensional data, an iterative technique is presented to refine this consensus matrix in a way that encourages the algorithms to agree upon a common solution. we utilize the theory of nearly uncoupled markov chains to determine the number, k, of clusters in a dataset by considering a random walk on the graph defined by the consensus matrix. the eigenvalues of the associated transition probability matrix are used to determine the number of clusters. this method succeeds in determining the number of clusters in many datasets for which previous methods fail. for every considered dataset, the consensus method provides a final result with accuracy well above the average of the individual algorithms.",19 "watergan: unsupervised generative network to enable real-time color correction of monocular underwater images. this paper reports on watergan, a generative adversarial network (gan) for generating realistic underwater images from in-air image and depth pairings in an unsupervised pipeline used for color correction of monocular underwater images. cameras onboard autonomous and remotely operated vehicles can capture high resolution images to map the seafloor; however, underwater image formation is subject to the complex process of light propagation through the water column. the raw images retrieved are characteristically different from images taken in air due to the effects of absorption and scattering, which cause attenuation of light at different rates for different wavelengths. while this physical process is well described theoretically, the model depends on many parameters intrinsic to the water column as well as the objects in the scene. 
these factors make the recovery of these parameters difficult without simplifying assumptions or field calibration; hence, restoration of underwater images is a non-trivial problem. deep learning has demonstrated great success in modeling complex nonlinear systems, but requires a large amount of training data, which is difficult to compile in deep sea environments. using watergan, we generate a large training dataset of paired imagery, both raw underwater and true color in-air, as well as depth data. this data serves as input to a novel end-to-end network for color correction of monocular underwater images. due to the depth-dependent water column effects inherent in underwater environments, we show that our end-to-end network implicitly learns a coarse depth estimate of the underwater scene from monocular underwater images. our proposed pipeline is validated by testing on real data collected both from a pure water tank and from underwater surveys in field testing. source code is made publicly available with sample datasets and pretrained models.",4 "mag: a multilingual, knowledge-base agnostic and deterministic entity linking approach. entity linking has recently been the subject of a significant body of research. currently, the best performing approaches rely on trained mono-lingual models. porting these approaches to other languages is consequently a difficult endeavor as it requires corresponding training data and retraining of the models. we address this drawback by presenting a novel multilingual, knowledge-base agnostic and deterministic approach to entity linking, dubbed mag. mag is based on a combination of context-based retrieval on structured knowledge bases and graph algorithms. we evaluate mag on 23 data sets and in 7 languages. our results show that the best approach trained on english datasets (pboh) achieves a micro f-measure that is up to 4 times worse on datasets in other languages. mag, on the other hand, achieves state-of-the-art performance on english datasets and reaches a micro f-measure that is 0.6 higher than that of pboh on non-english languages.",4 "the face synthesis (fasy) system for generation of a face image from human description. this paper aims at generating a new face based on a human-like description using a new concept. the fasy (face synthesis) system is a face database retrieval and new face generation system under development. 
one of its main features is the generation of the requested face when it is not found in the existing database, which also allows continuous growing of the database.",4 "watersheds on edge or node weighted graphs ""par l'exemple"". watersheds may be defined on node or edge weighted graphs. we show that they are identical: for each edge (resp. node) weighted graph there exists a node (resp. edge) weighted graph with the same minima and catchment basins.",4 "humans and deep networks largely agree on which kinds of variation make object recognition harder. view-invariant object recognition is a challenging problem that has attracted much attention among the psychology, neuroscience, and computer vision communities. humans are notoriously good at it, even for variations that are presumably more difficult to handle than others (e.g. 3d rotations). humans are thought to solve the problem through hierarchical processing along the ventral stream, which progressively extracts more and more invariant visual features. this feed-forward architecture has inspired a new generation of bio-inspired computer vision systems called deep convolutional neural networks (dcnn), which are currently the best algorithms for object recognition in natural images. here, for the first time, we systematically compared human feed-forward vision and dcnns at view-invariant object recognition, using the same images and controlling the kinds of transformation as well as their magnitude. we used four object categories and images rendered from 3d computer models. in total, 89 human subjects participated in 10 experiments in which they had to discriminate between two or four categories after rapid presentation with backward masking. we also tested two recent dcnns on the same tasks. we found that humans and dcnns largely agreed on the relative difficulties of each kind of variation: rotation in depth is by far the hardest transformation to handle, followed by scale, then rotation in plane, and finally position. this suggests that humans recognize objects mainly through 2d template matching, rather than by constructing 3d object models, and that dcnns are not unreasonable models of human feed-forward vision. also, our results show that the variation levels in rotation in depth and scale strongly modulate both humans' and dcnns' recognition performances. we thus argue that these variations should be controlled in the image datasets used in vision research.",4 "joint estimation of multiple graphical models from high dimensional time series. 
in this manuscript we consider the problem of jointly estimating multiple graphical models in high dimensions. we assume that the data are collected from n subjects, each of which consists of possibly dependent observations. the graphical models of the subjects vary, but are assumed to change smoothly corresponding to a measure of closeness between subjects. we propose a kernel based method for jointly estimating all graphical models. theoretically, under a double asymptotic framework, where both (t,n) and the dimension increase, we provide an explicit rate of convergence in parameter estimation. it characterizes the strength one can borrow across different individuals and the impact of data dependence on parameter estimation. empirically, experiments on both synthetic and real resting state functional magnetic resonance imaging (rs-fmri) data illustrate the effectiveness of the proposed method.",19 "robust optical flow estimation in rainy scenes. optical flow estimation in rainy scenes is challenging due to the background degradation introduced by rain streaks and rain accumulation effects in the scene. the rain accumulation effect refers to the poor visibility of remote objects due to intense rainfall. existing optical flow methods are erroneous when applied to rain sequences, because the conventional brightness constancy constraint (bcc) and gradient constancy constraint (gcc) generally break down in this situation. based on the observation that the rgb color channels receive raindrop radiance equally, we introduce a residue channel as a new data constraint to reduce the effect of rain streaks. to handle rain accumulation, our method decomposes the image into a piecewise-smooth background layer and a high-frequency detail layer. it also enforces the bcc on the background layer only. results on both a synthetic dataset and real images show that our algorithm outperforms existing methods on different types of rain sequences. to our knowledge, this is the first optical flow method specifically dealing with rain.",4 "a quantum mechanical approach to modelling reliability of sensor reports. dempster-shafer evidence theory is widely applied in multi-sensor data fusion. however, lots of uncertainty and interference exist in practical situations, especially on the battle field. it is still an open issue to model the reliability of sensor reports. many methods have been proposed based on the relationship among the collected data. 
in this letter, we propose a quantum mechanical approach to evaluate the reliability of sensor reports, based on the properties of the sensor itself. the proposed method can be used to modify the combination of evidence.",4 "weakly supervised plda training. plda is a popular normalization approach for the i-vector model, and it has delivered state-of-the-art performance in speaker verification. however, plda training requires a large amount of labelled development data, which is highly expensive in most cases. we present a cheap plda training approach, which assumes that speakers in the same session can be easily separated, and that speakers in different sessions are simply different. this results in `weak labels' which are not fully accurate but cheap, leading to weak plda training. experimental results on a real-life large-scale telephony customer service corpus demonstrated that the weak training can offer good performance when human-labelled data are limited. interestingly, the weak training can be employed as a discriminative adaptation approach, which is more efficient than the prevailing unsupervised method when human-labelled data are insufficient.",4 "non-simplifying graph rewriting termination. so far, a large amount of work in natural language processing (nlp) has relied on trees as the core mathematical structure to represent linguistic information (e.g. in chomsky's work). however, some linguistic phenomena do not cope properly with trees. in a former paper, we showed the benefit of encoding linguistic structures by graphs and of using graph rewriting rules to compute on those structures. justified by linguistic considerations, our graph rewriting is characterized by two features: first, node creation along computations and second, non-local edge modifications. under these hypotheses, we show that uniform termination is undecidable and that non-uniform termination is decidable. we describe two termination techniques based on weights and give complexity bounds on the derivation length of these rewriting systems.",4 "compressed sensing using generative models. the goal of compressed sensing is to estimate a vector from an underdetermined system of noisy linear measurements, by making use of prior knowledge of the structure of vectors in the relevant domain. for almost all results in this literature, the structure is represented by sparsity in a well-chosen basis. 
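The standard sparsity-based compressed sensing setting just described can be illustrated with the classical Lasso baseline that the abstract later compares against. This is a minimal ISTA (iterative soft-thresholding) sketch under illustrative assumptions: the problem sizes, regularization weight, and iteration count are chosen for the demo, not taken from the paper.

```python
import numpy as np

def ista(A, y, lam=0.01, iters=2000):
    """Iterative soft-thresholding for the Lasso problem
    min_x 0.5*||y - A x||^2 + lam*||x||_1, the standard
    sparse-recovery baseline for compressed sensing."""
    L = np.linalg.norm(A, 2) ** 2              # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        g = x - A.T @ (A @ x - y) / L          # gradient step on the quadratic term
        x = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)  # soft threshold
    return x

# Noiseless demo: a 5-sparse vector in R^100 from 40 gaussian measurements.
rng = np.random.default_rng(0)
n, m, k = 100, 40, 5
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.normal(size=k)
A = rng.normal(size=(m, n)) / np.sqrt(m)
y = A @ x_true
x_hat = ista(A, y)
```

The generative-model result that follows replaces the sparsity prior with the constraint that the signal lies near the range of a network, and replaces the soft-thresholding step with gradient descent in the latent space.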
we show how to achieve guarantees similar to standard compressed sensing but without employing sparsity at all. instead, we suppose that vectors lie near the range of a generative model $g: \mathbb{r}^k \to \mathbb{r}^n$. our main theorem is that, if $g$ is $l$-lipschitz, then roughly $o(k \log l)$ random gaussian measurements suffice for an $\ell_2/\ell_2$ recovery guarantee. we demonstrate our results using generative models from published variational autoencoder and generative adversarial networks. our method can use $5$-$10$x fewer measurements than lasso for the same accuracy.",19 "properties of sparse distributed representations and their application to hierarchical temporal memory. empirical evidence demonstrates that every region of the neocortex represents information using sparse activity patterns. this paper examines sparse distributed representations (sdrs), the primary information representation strategy in hierarchical temporal memory (htm) systems and the neocortex. we derive a number of properties that are core to scaling, robustness, and generalization. we use the theory to provide practical guidelines and illustrate the power of sdrs as the basis of htm. our goal is to help create a unified mathematical and practical framework for sdrs as it relates to cortical function.",16 "aorta segmentation for stent simulation. simulation of arterial stenting procedures prior to intervention allows for appropriate device selection as well as highlighting potential complications. to this end, we present a framework for facilitating virtual aortic stenting from a contrast computer tomography (ct) scan. specifically, we present a method for both lumen and outer wall segmentation that may be employed in determining the appropriateness of an intervention as well as in the selection and localization of the device. the more challenging recovery of the outer wall is based on a novel minimal closure tracking algorithm. our aortic segmentation method has been validated on over 3000 multiplanar reformatting (mpr) planes from 50 ct angiography data sets, yielding a dice similarity coefficient (dsc) of 90.67%.",4 "a framework for genetic algorithms based on hadoop. genetic algorithms (gas) are powerful metaheuristic techniques used in many real-world applications. the sequential execution of gas requires considerable computational power in terms of time and resources. 
nevertheless, gas are naturally parallel, and accessing a parallel platform such as the cloud is easy and cheap. apache hadoop is one of the common services that can be used for parallel applications. however, using hadoop to develop a parallel version of gas is not simple without facing its inner workings. even though some sequential frameworks for gas already exist, there is no framework supporting the development of ga applications that can be executed in parallel. this paper describes a framework for parallel gas on the hadoop platform, following the paradigm of mapreduce. the main purpose of this framework is to allow the user to focus on the aspects of the ga that are specific to the problem to be addressed, being sure that this task is going to be correctly executed on the cloud with good performance. the framework has also been exploited to develop an application for the feature subset selection problem. a preliminary analysis of the performance of the developed ga application has been performed using three datasets and has shown promising performance.",4 "speaker recognition for children's speech. this paper presents results on speaker recognition (sr) for children's speech, using the ogi kids corpus and gmm-ubm and gmm-svm sr systems. regions of the spectrum containing important speaker information for children are identified by conducting sr experiments over 21 frequency bands. as for adults, the spectrum can be split into four regions, with the first (containing primary vocal tract resonance information) and third (corresponding to high frequency speech sounds) being most useful for sr. however, the frequencies at which these regions occur are 11% to 38% higher for children. it is also noted that subband sr rates are lower for younger children. finally, results are presented of sr experiments to identify a child in a class (30 children of similar age) and in a school (288 children of varying ages). class performance depends on age, with accuracy varying from 90% for young children to 99% for older children. the identification rate achieved for a child in a school is 81%.",4 "a deep learning-based food calorie estimation method in dietary assessment. obesity treatment requires obese patients to record all food intakes per day. computer vision has been introduced to estimate calories from food images. in order to increase the accuracy of detection and reduce the error of volume estimation in food calorie estimation, we present a calorie estimation method in this paper. to estimate the calories of food, a top view and a side view are needed. 
faster r-cnn is used to detect the food and a calibration object. the grabcut algorithm is used to get each food's contour. then the volume is estimated from the food and the corresponding calibration object. finally we estimate each food's calories. experiment results show our estimation method is effective.",4 "representing human and machine dictionaries in markup languages. in this chapter we present the main issues in representing machine readable dictionaries in xml, in particular according to the text encoding initiative (tei) guidelines.",4 "deep multi-view learning with stochastic decorrelation loss. multi-view learning aims to learn an embedding space where multiple views are either maximally correlated for cross-view recognition, or decorrelated for latent factor disentanglement. a key challenge for deep multi-view representation learning is scalability. to correlate or decorrelate multi-view signals, the covariance of the whole training set should be computed, which does not fit well with the mini-batch based training strategy; moreover, (de)correlation should be done in a way that is free of svd-based computation in order to scale to contemporary layer sizes. in this work, a unified approach is proposed for efficient and scalable deep multi-view learning. specifically, a mini-batch based stochastic decorrelation loss (sdl) is proposed, which can be applied to any network layer to provide soft decorrelation of the layer's activations. it reveals the connection between deep multi-view learning models such as deep canonical correlation analysis (dcca) and the factorisation autoencoder (fae), and allows them to be easily implemented. we show that sdl is superior to other decorrelation losses in terms of efficacy and scalability.",4 "an application of multiview techniques to the nhanes dataset. disease prediction or classification using health datasets involves using well-known predictors associated with the disease as features for the models. this study considers multiple data components of an individual's health, using the relationships between variables to generate features that may improve the performance of disease classification models. in order to capture information from different aspects of the data, this project uses a multiview learning approach, using canonical correlation analysis (cca), a technique that finds projections with maximum correlations between two data views. 
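CCA, as just described, finds projections of two views with maximal correlation. A minimal numpy-only sketch (an illustration, not the NHANES pipeline; the two synthetic "views" sharing one latent factor are hypothetical): after centering and whitening each view via its SVD, the canonical correlations are the singular values of the cross-product of the whitened bases.

```python
import numpy as np

def first_canonical_corr(A, B):
    """Minimal CCA sketch: center and whiten each view via SVD; the
    singular values of Ua.T @ Ub are the canonical correlations
    (cosines of the principal angles between the two column spaces)."""
    A = A - A.mean(axis=0)
    B = B - B.mean(axis=0)
    Ua = np.linalg.svd(A, full_matrices=False)[0]   # orthonormal basis of view A
    Ub = np.linalg.svd(B, full_matrices=False)[0]   # orthonormal basis of view B
    s = np.linalg.svd(Ua.T @ Ub, compute_uv=False)
    return s[0]                                     # top canonical correlation

# Two hypothetical views driven by one shared latent signal z.
rng = np.random.default_rng(0)
z = rng.normal(size=(500, 1))
view_a = np.hstack([z + 0.1 * rng.normal(size=(500, 1)) for _ in range(4)])
view_b = np.hstack([z + 0.1 * rng.normal(size=(500, 1)) for _ in range(3)])
corr = first_canonical_corr(view_a, view_b)         # close to 1 here
```

Because both views are noisy copies of the same latent factor, the top canonical correlation is close to 1; with unrelated views it would be near 0.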
data categories collected in the nhanes survey (1999-2014) are used as the views to learn multiview representations. the usefulness of the representations is demonstrated by applying them as features in a diabetes classification task.",4 "tumor classification and segmentation of mr brain images. the diagnosis and segmentation of tumors using any medical diagnostic tool can be challenging due to the varying nature of this pathology. magnetic resonance imaging (mri) is an established diagnostic tool for various diseases and disorders and plays a major role in clinical neuro-diagnosis. supplementing this technique with automated classification and segmentation tools is gaining importance, to reduce errors and the time needed to make a conclusive diagnosis. in this paper a simple three-step algorithm is proposed; (1) identification of patients that present with tumors, (2) automatic selection of abnormal slices of the patients, and (3) segmentation and detection of the tumor. features were extracted using a discrete wavelet transform on the normalized images and classified with a support vector machine (for step (1)) and a random forest (for step (2)). the 400 subjects were divided in a 3:1 ratio between training and test with no overlap. this study is novel in terms of the use of data, as it employed the entire t2 weighted slices as a single image for classification, and a unique combination of a contralateral approach with patch thresholding for segmentation, which does not require a training set or a template as is used in other segmentation studies. using the proposed method, the tumors were segmented accurately with a classification accuracy of 95%, with 100% specificity and 90% sensitivity.",4 "multispectral image denoising with an optimized vector non-local mean filter. nowadays, many applications rely on images of high quality to ensure good performance in conducting their tasks. however, noise goes against this objective and is an unavoidable issue in many applications. therefore, it is essential to develop techniques to attenuate the impact of noise, while maintaining the integrity of relevant information in images. in this work we propose to extend the application of the non-local means filter (nlm) to the vector case and apply it to the denoising of multispectral images. the objective is to benefit from the additional information brought by multispectral imaging systems. the nlm filter exploits the redundancy of information in an image to remove noise. the restored pixel is a weighted average of the pixels in the image. 
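The NLM principle just stated (each restored sample is a patch-similarity-weighted average of all samples) can be sketched on a 1D signal. This is a toy illustration of the principle only, not the paper's optimized vector filter; the patch radius and bandwidth h are illustrative, and no SURE-based parameter tuning or similar-pixel preselection is attempted.

```python
import numpy as np

def nlm_1d(signal, patch=4, h=0.4):
    """Non-local means sketch on a 1D signal: each sample is replaced by a
    weighted average of all samples, with weights driven by the similarity
    of surrounding patches rather than by spatial proximity."""
    n = len(signal)
    pad = np.pad(signal, patch, mode='reflect')
    patches = np.array([pad[i:i + 2 * patch + 1] for i in range(n)])
    out = np.empty(n)
    for i in range(n):
        d2 = ((patches - patches[i]) ** 2).mean(axis=1)  # patch distances
        w = np.exp(-d2 / h ** 2)                          # similarity weights
        out[i] = (w * signal).sum() / w.sum()
    return out

# Piecewise-constant signal with additive white gaussian noise.
rng = np.random.default_rng(0)
clean = np.sign(np.sin(np.linspace(0, 6 * np.pi, 200)))
noisy = clean + 0.3 * rng.normal(size=200)
den = nlm_1d(noisy)
```

Because patches from the same flat region are similar wherever they occur, redundancy across the whole signal is averaged out while the sharp transitions are largely preserved, which is the property that motivates NLM for images.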
in our contribution, we propose an optimization framework to dynamically fine tune the nlm filter parameters and attenuate its computational complexity by considering only the most similar pixels when computing the restored pixel. the filter parameters are optimized using stein's unbiased risk estimator (sure) rather than by ad hoc means. experiments were conducted on multispectral images corrupted by additive white gaussian noise, and psnr and similarity comparisons with other approaches are provided to illustrate the efficiency of our approach in terms of both denoising performance and computation complexity.",4 "variation of word frequencies in russian literary texts. we study the variation of word frequencies in russian literary texts. our findings indicate that the standard deviation of a word's frequency across texts depends on its average frequency according to a power law with exponent $0.62,$ showing that rarer words have a relatively larger degree of frequency volatility (i.e., ""burstiness""). several latent factors models are estimated to investigate the structure of the word frequency distribution. the dependence of a word's frequency volatility on its average frequency can be explained by the asymmetry in the distribution of the latent factors.",4 "a semantic classifier approach to document classification. in this paper we propose a new document classification method, bridging discrepancies (the so-called semantic gap) between the training set and the application sets of textual data. we demonstrate its superiority over classical text classification approaches, including traditional classifier ensembles. the method consists in combining a document categorization technique with a single classifier or a classifier ensemble (the semcom algorithm - committee with semantic categorizer).",4 "building fast and compact convolutional neural networks for offline handwritten chinese character recognition. like other problems in computer vision, offline handwritten chinese character recognition (hccr) has achieved impressive results using convolutional neural network (cnn)-based methods. however, larger and deeper networks are needed to deliver state-of-the-art results in this domain. such networks intuitively appear to incur high computational cost, and require the storage of a large number of parameters, which renders them unfeasible for deployment in portable devices. 
to solve this problem, we propose a global supervised low-rank expansion (gslre) method and an adaptive drop-weight (adw) technique to address the problems of speed and storage capacity. we design a nine-layer cnn for hccr consisting of 3,755 classes, and devise an algorithm that can reduce the network's computational cost by nine times and compress the network to 1/18 of the original size of the baseline model, with only a 0.21% drop in accuracy. in tests, the proposed algorithm surpassed the best single-network performance reported thus far in the literature, while requiring only 2.3 mb for storage. furthermore, when integrated with our effective forward implementation, the recognition of an offline character image took only 9.7 ms on a cpu. compared with the state-of-the-art cnn model for hccr, our approach is approximately 30 times faster, yet 10 times more cost efficient.",4 "tree memory networks for modelling long-term temporal dependencies. in the domain of sequence modelling, recurrent neural networks (rnn) have been capable of achieving impressive results in a variety of application areas including visual question answering, part-of-speech tagging and machine translation. however, this success in modelling short term dependencies has not successfully transitioned to application areas such as trajectory prediction, which require capturing both short term and long term relationships. in this paper, we propose a tree memory network (tmn) for modelling long term and short term relationships in sequence-to-sequence mapping problems. the proposed network architecture is composed of an input module, a controller and a memory module. in contrast to the related literature, which models the memory as a sequence of historical states, we model the memory as a recursive tree structure. this structure effectively captures temporal dependencies across both short term and long term sequences using its hierarchical structure. we demonstrate the effectiveness and flexibility of the proposed tmn in two practical problems, aircraft trajectory modelling and pedestrian trajectory modelling in a surveillance setting, and in both cases we outperform the current state-of-the-art. 
furthermore, we perform an in depth analysis of the evolution of the memory module content over time, and provide visual evidence of how the proposed tmn is able to map both long term and short term relationships efficiently via its hierarchical structure.",4 "semantic-aware grad-gan for virtual-to-real urban scene adaption. recent advances in vision tasks (e.g., segmentation) highly depend on the availability of large-scale real-world image annotations obtained by cumbersome human labors. moreover, the perception performance often drops significantly for new scenarios, due to the poor generalization capability of models trained on limited and biased annotations. in this work, we resort to transferring knowledge from automatically rendered scene annotations in the virtual world to facilitate real-world visual tasks. although virtual-world annotations are ideally diverse and unlimited, the discrepant data distributions between the virtual and real world make it challenging for knowledge transferring. we thus propose a novel semantic-aware grad-gan (sg-gan) to perform virtual-to-real domain adaption with the ability of retaining vital semantic information. beyond the simple holistic color/texture transformation achieved by prior works, sg-gan successfully personalizes the appearance adaption for each semantic region in order to preserve its key characteristics for better recognition. it presents two main contributions over traditional gans: 1) a soft gradient-sensitive objective for keeping semantic boundaries; 2) a semantic-aware discriminator for validating the fidelity of personalized adaptions with respect to each semantic region. qualitative and quantitative experiments demonstrate the superiority of sg-gan in scene adaption over state-of-the-art gans. further evaluations on semantic segmentation on cityscapes show that using the virtual images adapted by sg-gan dramatically improves segmentation performance compared to the original virtual data. we release our code at https://github.com/peilun-li/sg-gan.",4 "approximated robust principal component analysis for improved general scene background subtraction. the research reported in this paper addresses the fundamental task of separation of locally moving or deforming image areas from a static or globally moving background. 
it builds on the latest developments in the field of robust principal component analysis, specifically, the recently reported practical solutions to the long-standing problem of recovering the low-rank and sparse parts of a large matrix formed as the sum of these two components. this article addresses a number of critical issues, including: embedding global motion parameters in the matrix decomposition model, i.e., estimating global motion parameters simultaneously with the foreground/background separation task; considering matrix block-sparsity rather than generic matrix sparsity as a natural feature of video processing applications; attenuating background ghosting effects when the foreground is subtracted; and, critically, providing an extremely efficient algorithm to solve the low-rank/sparse matrix decomposition task. the first aspect is important for background/foreground separation in generic video sequences, where the background usually obeys global displacements originated by the camera motion during the capturing process. the second aspect exploits the fact that in video processing applications the sparse matrix has a particular structure, where the non-zero matrix entries are not randomly distributed but build small blocks within the sparse matrix. the next feature of the proposed approach addresses the removal of ghosting effects originated by foreground silhouettes and by the lack of information in the occluded background regions of the image. finally, the proposed model also tackles algorithmic complexity by introducing an extremely efficient ""svd-free"" technique that can be applied to most background/foreground separation tasks in conventional video processing.",4 "supervised hashing using graph cuts and boosted decision trees. embedding image features into a binary hamming space can improve both the speed and accuracy of large-scale query-by-example image retrieval systems. supervised hashing aims to map the original features to compact binary codes in a manner that preserves the label-based similarities of the original data. most existing approaches apply a single form of hash function, with an optimization process that is typically deeply coupled to this specific form. this tight coupling restricts the flexibility of those methods, and can result in complex optimization problems that are difficult to solve.
in this work we proffer a flexible yet simple framework that is able to accommodate different types of loss functions and hash functions. the proposed framework allows a number of existing approaches to hashing to be placed in context, and simplifies the development of new problem-specific hashing methods. our framework decomposes the hashing learning problem into two steps: binary code (hash bits) learning, and hash function learning. the first step can typically be formulated as a binary quadratic problem, and the second step can be accomplished by training standard binary classifiers. for solving large-scale binary code inference, we show how to ensure that the binary quadratic problems are submodular such that an efficient graph cut approach can be used. to achieve efficiency as well as efficacy on large-scale high-dimensional data, we propose to use boosted decision trees as the hash functions, which are nonlinear, highly descriptive, and fast to train and evaluate. experiments demonstrate that the proposed method significantly outperforms most state-of-the-art methods, especially on high-dimensional data.",4 "unsupervised activity discovery and characterization from event-streams. we present a framework to discover and characterize different classes of everyday activities from event-streams. we begin by representing activities as bags of event n-grams. this allows us to analyze the global structural information of activities, using their local event statistics. we demonstrate how maximal cliques in an undirected edge-weighted graph of activities can be used for activity-class discovery in an unsupervised manner. we show how modeling an activity as a variable length markov process can be used to discover recurrent event-motifs that characterize the discovered activity-classes. we present results over extensive data-sets, collected from multiple active environments, to show the competence and generalizability of our proposed framework.",4 "hyperspectral image superresolution: an edge-preserving convex formulation. hyperspectral remote sensing images (hsis) are characterized by having a low spatial resolution and a high spectral resolution, whereas multispectral images (msis) are characterized by low spectral and high spatial resolutions. these complementary characteristics have stimulated active research in the inference of images with high spatial and spectral resolutions from hsi-msi pairs. 
paper, formulate data fusion problem minimization convex objective function containing two data-fitting terms edge-preserving regularizer. data-fitting terms quadratic account blur, different spatial resolutions, additive noise; regularizer, form vector total variation, promotes aligned discontinuities across reconstructed hyperspectral bands. optimization described rather hard, owing non-diagonalizable linear operators, non-quadratic non-smooth nature regularizer, large size image inferred. tackle difficulties tailoring split augmented lagrangian shrinkage algorithm (salsa)---an instance alternating direction method multipliers (admm)---to optimization problem. using convenient variable splitting exploiting fact hsis generally ""live"" low-dimensional subspace, obtain effective algorithm yields state-of-the-art results, illustrated experiments.",4 "optimal transport maps distribution preserving operations latent spaces generative models. generative models variational auto encoders (vaes) generative adversarial networks (gans) typically trained fixed prior distribution latent space, uniform gaussian. trained model obtained, one sample generator various forms exploration understanding, interpolating two samples, sampling vicinity sample exploring differences pair samples applied third sample. paper, show latent space operations used literature far induce distribution mismatch resulting outputs prior distribution model trained on. address this, propose use distribution matching transport maps ensure latent space operations preserve prior distribution, minimally modifying original operation. experimental results validate proposed operations give higher quality samples compared original operations.",4 "closed-form marginal likelihood gamma-poisson factorization. present novel understandings gamma-poisson (gap) model, probabilistic matrix factorization model count data. show gap rewritten free score/activation matrix. 
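The distribution mismatch discussed in the latent-space-operations abstract above is easy to demonstrate numerically: linear interpolation between high-dimensional Gaussian latent samples shrinks norms, while spherical interpolation roughly preserves them. A hedged sketch (slerp is a standard remedy, not necessarily the paper's transport-map construction):

```python
import numpy as np

def lerp(z0, z1, t):
    # Linear interpolation: midpoints have smaller norm than prior samples.
    return (1 - t) * z0 + t * z1

def slerp(z0, z1, t):
    # Spherical interpolation: approximately preserves the norm
    # distribution of high-dimensional Gaussian samples.
    omega = np.arccos(np.clip(
        np.dot(z0, z1) / (np.linalg.norm(z0) * np.linalg.norm(z1)), -1.0, 1.0))
    return (np.sin((1 - t) * omega) * z0 + np.sin(t * omega) * z1) / np.sin(omega)

rng = np.random.default_rng(0)
d = 512
z0, z1 = rng.standard_normal(d), rng.standard_normal(d)
# In high dimension, independent Gaussians are nearly orthogonal with
# norm close to sqrt(d); the lerp midpoint shrinks by about sqrt(2).
mid_lerp = np.linalg.norm(lerp(z0, z1, 0.5))
mid_slerp = np.linalg.norm(slerp(z0, z1, 0.5))
```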
gives us new insights estimation topic/dictionary matrix maximum marginal likelihood estimation. particular, explains robustness estimator over-specified values factorization rank particular ability automatically prune spurious dictionary columns, empirically observed previous work. marginalization activation matrix leads turn new monte-carlo expectation-maximization algorithm favorable properties.",19 "labelbank: revisiting global perspectives semantic segmentation. semantic segmentation requires detailed labeling image pixels object category. information derived local image patches necessary describe detailed shape individual objects. however, information ambiguous result noisy labels. global inference image content instead capture general semantic concepts present. advocate holistic inference image concepts provides valuable information detailed pixel labeling. propose generic framework leverage holistic information form labelbank pixel-level segmentation. show ability framework improve semantic segmentation performance variety settings. learn models extracting holistic labelbank visual cues, attributes, and/or textual descriptions. demonstrate improvements semantic segmentation accuracy standard datasets across range state-of-the-art segmentation architectures holistic inference approaches.",4 "multi-parametric solution-path algorithm instance-weighted support vector machines. instance-weighted variant support vector machine (svm) attracted considerable attention recently since useful various machine learning tasks non-stationary data analysis, heteroscedastic data modeling, transfer learning, learning rank, transduction. important challenge scenarios overcome computational bottleneck---instance weights often change dynamically adaptively, thus weighted svm solutions must repeatedly computed. paper, develop algorithm efficiently exactly update weighted svm solutions arbitrary change instance weights. 
technically, contribution regarded extension conventional solution-path algorithm single regularization parameter multiple instance-weight parameters. however, extension gives rise significant problem breakpoints (at solution path turns) identified high-dimensional space. facilitate this, introduce parametric representation instance weights. also provide geometric interpretation weight space using notion critical region: polyhedron current affine solution remains optimal. find breakpoints intersections solution path boundaries polyhedrons. extensive experiments various practical applications, demonstrate usefulness proposed algorithm.",4 "algorithm runtime prediction: methods & evaluation. perhaps surprisingly, possible predict long algorithm take run previously unseen input, using machine learning techniques build model algorithm's runtime function problem-specific instance features. models important applications algorithm analysis, portfolio-based algorithm selection, automatic configuration parameterized algorithms. past decade, wide variety techniques studied building models. here, describe extensions improvements existing models, new families models, -- perhaps importantly -- much thorough treatment algorithm parameters model inputs. also comprehensively describe new existing features predicting algorithm runtime propositional satisfiability (sat), travelling salesperson (tsp) mixed integer programming (mip) problems. evaluate innovations largest empirical analysis kind, comparing wide range runtime modelling techniques literature. experiments consider 11 algorithms 35 instance distributions; also span wide range sat, mip, tsp instances, least structured generated uniformly random structured emerged real industrial applications. 
overall, demonstrate new models yield substantially better runtime predictions previous approaches terms generalization new problem instances, new algorithms parameterized space, simultaneously.",4 "robust principal component analysis graphs. principal component analysis (pca) widely used tool linear dimensionality reduction clustering. still highly sensitive outliers scale well respect number data samples. robust pca solves first issue sparse penalty term. second issue handled matrix factorization model, however non-convex. besides, pca based clustering also enhanced using graph data similarity. article, introduce new model called ""robust pca graphs"" incorporates spectral graph regularization robust pca framework. proposed model benefits 1) robustness principal components occlusions missing values, 2) enhanced low-rank recovery, 3) improved clustering property due graph smoothness assumption low-rank matrix, 4) convexity resulting optimization problem. extensive experiments 8 benchmark, 3 video 2 artificial datasets corruptions clearly reveal model outperforms 10 state-of-the-art models clustering low-rank recovery tasks.",4 mesa: maximum entropy simulated annealing. probabilistic reasoning systems combine different probabilistic rules probabilistic facts arrive desired probability values consequences. paper describe mesa-algorithm (maximum entropy simulated annealing) derives joint distribution variables propositions. takes account reliability probability values resolve conflicts contradictory statements. joint distribution represented terms marginal distributions therefore allows process large inference networks determine desired probability values high precision. procedure derives maximum entropy distribution subject given constraints. applied inference networks arbitrary topology may extended number directions.,4 "that's fact: distinguishing factual emotional argumentation online dialogue. 
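The maximum-entropy step in the mesa abstract above amounts to finding the exponential-family distribution that satisfies the given constraints; a minimal sketch for a single mean constraint over a discrete support, via gradient ascent on the dual variable (the learning rate and iteration count are illustrative assumptions):

```python
import numpy as np

def maxent(values, target_mean, lr=0.5, n_iter=2000):
    """Maximum-entropy distribution over discrete `values` subject to a
    mean constraint, found by gradient ascent on the dual variable.
    The solution has the exponential-family form p_i ~ exp(theta * v_i)."""
    values = np.asarray(values, dtype=float)
    theta = 0.0
    for _ in range(n_iter):
        w = np.exp(theta * values)
        p = w / w.sum()
        # Dual gradient: constraint violation of the current mean.
        theta += lr * (target_mean - p @ values)
    return p
```

With the constraint set at the unconstrained mean, the result reduces to the uniform distribution, as maximum entropy requires.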
investigate characteristics factual emotional argumentation styles observed online debates. using annotated set ""factual"" ""feeling"" debate forum posts, extract patterns highly correlated factual emotional arguments, apply bootstrapping methodology find new patterns larger pool unannotated forum posts. process automatically produces large set patterns representing linguistic expressions highly correlated factual emotional language. finally, analyze discriminating patterns better understand defining characteristics factual emotional arguments.",4 "chinese text wild. introduce chinese text wild, large dataset chinese text street view images. optical character recognition (ocr) document images well studied many commercial tools available, detection recognition text natural images still challenging problem, especially complicated character sets chinese text. lack training data always problem, especially deep learning methods require massive training data. paper provide details newly created dataset chinese text 1 million chinese characters annotated experts 30 thousand street view images. challenging dataset good diversity. contains planar text, raised text, text cities, text rural areas, text poor illumination, distant text, partially occluded text, etc. character dataset, annotation includes underlying character, bounding box, 6 attributes. attributes indicate whether complex background, whether raised, whether handwritten printed, etc. large size diversity dataset make suitable training robust neural networks various tasks, particularly detection recognition. give baseline results using several state-of-the-art networks, including alexnet, overfeat, google inception resnet character recognition, yolov2 character detection images. overall google inception best performance recognition 80.5% top-1 accuracy, yolov2 achieves map 71.0% detection. dataset, source code trained models publicly available website.",4 "variable computation recurrent neural networks. 
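The bootstrapping loop in the factual/feeling abstract above can be sketched with unigram "patterns" selected by label precision; the token-level patterns, thresholds, and single "factual" label here are simplifying assumptions, not the paper's pattern templates:

```python
from collections import Counter

def extract_patterns(posts, labels, min_count=2, threshold=0.8):
    """Score unigrams by how strongly they correlate with the 'factual'
    label; tokens above `threshold` precision become patterns."""
    total, factual = Counter(), Counter()
    for post, label in zip(posts, labels):
        for tok in set(post.lower().split()):
            total[tok] += 1
            if label == "factual":
                factual[tok] += 1
    return {t for t, c in total.items()
            if c >= min_count and factual[t] / c >= threshold}

def bootstrap(seed_posts, seed_labels, unlabeled, rounds=2):
    # Each round: learn patterns, auto-label matching unlabeled posts,
    # and fold them back into the training pool.
    posts, labels = list(seed_posts), list(seed_labels)
    for _ in range(rounds):
        patterns = extract_patterns(posts, labels)
        for post in list(unlabeled):
            if patterns & set(post.lower().split()):
                posts.append(post)
                labels.append("factual")
                unlabeled.remove(post)
    return extract_patterns(posts, labels)
```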
recurrent neural networks (rnns) used extensively increasing success model various types sequential data. much progress achieved devising recurrent units architectures flexibility capture complex statistics data, long range dependency localized attention phenomena. however, many sequential data (such video, speech language) highly variable information flow, recurrent models still consume input features constant rate perform constant number computations per time step, detrimental speed model capacity. paper, explore modification existing recurrent units allows learn vary amount computation perform step, without prior knowledge sequence's time structure. show experimentally models require fewer operations, also lead better performance overall evaluation tasks.",19 "learning avoid errors gans manipulating input spaces. despite recent advances, large scale visual artifacts still common occurrence images generated gans. previous work focused improving generator's capability accurately imitate data distribution $p_{data}$. paper, instead explore methods enable gans actively avoid errors manipulating input space. core idea apply small changes noise vector order shift away areas input space tend result errors. derive three different architectures idea. main one consists simple residual module leads significantly less visual artifacts, slightly decreasing diversity. module trivial add existing gans costs almost zero computation memory.",19 "adaboost forward stagewise regression first-order convex optimization methods. boosting methods highly popular effective supervised learning methods combine weak learners single accurate model good statistical performance. paper, analyze two well-known boosting methods, adaboost incremental forward stagewise regression (fs$_\varepsilon$), establishing precise connections mirror descent algorithm, first-order method convex optimization. consequence connections obtain novel computational guarantees boosting methods. 
particular, characterize convergence bounds adaboost, related margin log-exponential loss function, step-size sequence. furthermore, paper presents, first time, precise computational complexity results fs$_\varepsilon$.",19 "discriminative kalman filter nonlinear non-gaussian sequential bayesian filtering. kalman filter (kf) used variety applications computing posterior distribution latent states state space model. model requires linear relationship states observations. extensions kalman filter proposed incorporate linear approximations nonlinear models, extended kalman filter (ekf) unscented kalman filter (ukf). however, argue cases dimensionality observed variables greatly exceeds dimensionality state variables, model $p(\text{state}|\text{observation})$ proves easier learn accurate latent space estimation. derive validate call discriminative kalman filter (dkf): closed-form discriminative version bayesian filtering readily incorporates off-the-shelf discriminative learning techniques. further, demonstrate given mild assumptions, highly non-linear models $p(\text{state}|\text{observation})$ specified. motivate validate synthetic datasets neural decoding non-human primates, showing substantial increases decoding performance versus standard kalman filter.",19 "learning fusing multimodal features multi-task facial computing. propose deep learning-based feature fusion approach facial computing including face recognition well gender, race age detection. instead training single classifier face images classify based features person whose face appears image, first train four different classifiers classifying face images based race, age, gender identification (id). multi-task features extracted trained models cross-task-feature training conducted shows value fusing multimodal features extracted multi-tasks. found features trained one task used related tasks. interestingly, features trained task classes (e.g. id) used another task fewer classes (e.g. 
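The adaboost procedure analyzed in the boosting abstract above, with exhaustive decision stumps as weak learners, can be sketched as follows (a textbook implementation, not the mirror-descent view developed in the paper):

```python
import numpy as np

def adaboost_stumps(X, y, n_rounds=20):
    """AdaBoost with one-feature threshold stumps; y in {-1, +1}.
    Returns a list of (feature, threshold, polarity, alpha) weak learners."""
    n, d = X.shape
    w = np.full(n, 1.0 / n)
    model = []
    for _ in range(n_rounds):
        best = None
        for j in range(d):
            for thr in np.unique(X[:, j]):
                for pol in (1, -1):
                    pred = pol * np.where(X[:, j] <= thr, 1, -1)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, j, thr, pol, pred)
        err, j, thr, pol, pred = best
        err = max(err, 1e-12)  # guard against a perfect stump
        alpha = 0.5 * np.log((1 - err) / err)
        # Reweight: misclassified points gain weight.
        w *= np.exp(-alpha * y * pred)
        w /= w.sum()
        model.append((j, thr, pol, alpha))
    return model

def predict(model, X):
    score = sum(a * p * np.where(X[:, j] <= t, 1, -1)
                for j, t, p, a in model)
    return np.sign(score)
```

A single stump cannot label an interval correctly, but a few boosted rounds can, which makes a 1-d interval a convenient smoke test.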
race) outperforms features trained task itself. final feature fusion performed combining four types features extracted images four classifiers. feature fusion approach improves classification accuracy 7.2%, 20.1%, 22.2%, 21.8% margin, respectively, id, age, race gender recognition, results single classifiers trained individual features. proposed method applied applications different types data features extracted.",4 "learned deep representations action recognition?. success deep models led deployment areas computer vision, increasingly important understand representations work capturing. paper, shed light deep spatiotemporal representations visualizing two-stream models learned order recognize actions video. show local detectors appearance motion objects arise form distributed representations recognizing human actions. key observations include following. first, cross-stream fusion enables learning true spatiotemporal features rather simply separate appearance motion features. second, networks learn local representations highly class specific, also generic representations serve range classes. third, throughout hierarchy network, features become abstract show increasing invariance aspects data unimportant desired distinctions (e.g. motion patterns across various speeds). fourth, visualizations used shed light learned representations, also reveal idiosyncrasies training data explain failure cases system.",4 "video event recognition surveillance applications (versa). versa provides general-purpose framework defining recognizing events live recorded surveillance video streams. approach event recognition versa using declarative logic language define spatial temporal relationships characterize given event activity. requires definition certain fundamental spatial temporal relationships high-level syntax specifying frame templates query parameters. 
although handling uncertainty current versa implementation simplistic, language architecture amenable extending using fuzzy logic similar approaches. versa's high-level architecture designed work xml-based, services-oriented environments. versa thought subscribing xml annotations streamed lower-level video analytics service provides basic entity detection, labeling, tracking. one many versa event monitors could thus analyze video streams provide alerts certain events detected.",4 "spectral experts estimating mixtures linear regressions. discriminative latent-variable models typically learned using em gradient-based optimization, suffer local optima. paper, develop new computationally efficient provably consistent estimator mixture linear regressions, simple instance discriminative latent-variable model. approach relies low-rank linear regression recover symmetric tensor, factorized parameters using tensor power method. prove rates convergence estimator provide empirical evaluation illustrating strengths relative local optimization (em).",4 "towards generalization simplicity continuous control. work shows policies simple linear rbf parameterizations trained solve variety continuous control tasks, including openai gym benchmarks. performance trained policies competitive state art results, obtained elaborate parameterizations fully connected neural networks. furthermore, existing training testing scenarios shown limited prone over-fitting, thus giving rise trajectory-centric policies. training diverse initial state distribution shown produce global policies better generalization. allows interactive control scenarios system recovers large on-line perturbations; shown supplementary video.",4 "thermal visible synthesis face images using multiple regions. 
synthesis visible spectrum faces thermal facial imagery promising approach heterogeneous face recognition; enabling existing face recognition software trained visible imagery leveraged, allowing human analysts verify cross-spectrum matches effectively. propose new synthesis method enhance discriminative quality synthesized visible face imagery leveraging global (e.g., entire face) local regions (e.g., eyes, nose, mouth). here, region provides (1) independent representation corresponding area, (2) additional regularization terms, impact overall quality synthesized images. analyze effects using multiple regions synthesize visible face image thermal face. demonstrate approach improves cross-spectrum verification rates recently published synthesis approaches. moreover, using synthesized imagery, report results facial landmark detection (commonly used image registration), which critical part face recognition process.",4 "attention attention: architectures visual question answering (vqa). visual question answering (vqa) increasingly popular topic deep learning research, requiring coordination natural language processing computer vision modules single architecture. build upon model placed first vqa challenge developing thirteen new attention mechanisms introducing simplified classifier. performed 300 gpu hours extensive hyperparameter architecture searches able achieve evaluation score 64.78%, outperforming existing state-of-the-art single model's validation score 63.15%.",4 "robust method vote aggregation proposition verification invariant local features. paper presents method analysis vote space created local features extraction process multi-detection system. method opposed classic clustering approach gives high level control clusters composition verification steps. proposed method comprises graphical vote space presentation, proposition generation, two-pass iterative vote aggregation cascade filters verification propositions. 
cascade filters contain minor algorithms needed effective object detection verification. new approach drawbacks classic clustering approaches gives substantial control process detection. method exhibits exceptionally high detection rate conjunction low false detection chance comparison alternative methods.",4 "positive definite kernels machine learning. survey introduction positive definite kernels set methods inspired machine learning literature, namely kernel methods. first discuss properties positive definite kernels well reproducing kernel hilbert spaces, natural extension set functions $\{k(x,\cdot),x\in\mathcal{x}\}$ associated kernel $k$ defined space $\mathcal{x}$. discuss length construction kernel functions take advantage well-known statistical models. provide overview numerous data-analysis methods take advantage reproducing kernel hilbert spaces discuss idea combining several kernels improve performance certain tasks. also provide short cookbook different kernels particularly useful certain data-types images, graphs speech segments.",19 "comparing neural attractiveness-based visual features artwork recommendation. advances image processing computer vision latest years brought use visual features artwork recommendation. recent works shown visual features obtained pre-trained deep neural networks (dnns) perform well recommending digital art. recent works shown explicit visual features (evf) based attractiveness perform well preference prediction tasks, previous work compared dnn features versus specific attractiveness-based visual features (e.g. brightness, texture) terms recommendation performance. work, study compare performance dnn evf features purpose physical artwork recommendation using transactional data ugallery, online store physical paintings. addition, perform exploratory analysis understand dnn embedded features relation certain evf. 
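The rkhs machinery surveyed in the positive-definite-kernels abstract above is most easily seen through kernel ridge regression with an rbf kernel; a minimal sketch (the bandwidth and regularization values are illustrative assumptions):

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # k(x, y) = exp(-gamma * ||x - y||^2), a positive definite kernel.
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def fit_krr(X, y, gamma=1.0, lam=1e-3):
    """Kernel ridge regression: alpha = (K + lam I)^{-1} y.
    The predictor f(x) = sum_i alpha_i k(x_i, x) lives in the RKHS of k."""
    K = rbf_kernel(X, X, gamma)
    alpha = np.linalg.solve(K + lam * np.eye(len(X)), y)
    return lambda Xq: rbf_kernel(Xq, X, gamma) @ alpha
```

Fitting a smooth target such as a sine wave recovers it closely both on and between the training points.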
results show dnn features outperform evf, certain evf features suited physical artwork recommendation and, finally, show evidence certain neurons dnn might partially encoding visual features brightness, providing opportunity explaining recommendations based visual neural models.",4 "learning visualizing localized geometric features using 3d-cnn: application manufacturability analysis drilled holes. 3d convolutional neural networks (3d-cnn) used object recognition based voxelized shape object. however, interpreting decision making process 3d-cnns still infeasible task. paper, present unique 3d-cnn based gradient-weighted class activation mapping method (3d-gradcam) visual explanations distinct local geometric features interest within object. enable efficient learning 3d geometries, augment voxel data surface normals object boundary. train 3d-cnn augmented data identify local features critical decision-making using 3d gradcam. application feature identification framework recognize difficult-to-manufacture drilled hole features complex cad geometry. framework extended identify difficult-to-manufacture features multiple spatial scales leading real-time design manufacturability decision support system.",19 "twitter hash tag recommendation. rise popularity microblogging services like twitter led increased use content annotation strategies like hashtag. hashtags provide users tagging mechanism help organize, group, create visibility posts. simple idea challenging user practice leads infrequent usage. paper, investigate various methods recommending hashtags new posts created encourage widespread adoption usage. hashtag recommendation comes numerous challenges including processing huge volumes streaming data content small noisy. investigate preprocessing methods reduce noise data determine effective method hashtag recommendation based popular classification algorithms.",4 "deep active learning named entity recognition. 
deep learning yielded state-of-the-art performance many natural language processing tasks including named entity recognition (ner). however, typically requires large amounts labeled data. work, demonstrate amount labeled training data drastically reduced deep learning combined active learning. active learning sample-efficient, computationally expensive since requires iterative retraining. speed up, introduce lightweight architecture ner, viz., cnn-cnn-lstm model consisting convolutional character word encoders long short term memory (lstm) tag decoder. model achieves nearly state-of-the-art performance standard datasets task computationally much efficient best performing models. carry incremental active learning, training process, able nearly match state-of-the-art performance 25\% original training data.",4 "stochastic neural networks hierarchical reinforcement learning. deep reinforcement learning achieved many impressive results recent years. however, tasks sparse rewards long horizons continue pose significant challenges. tackle important problems, propose general framework first learns useful skills pre-training environment, leverages acquired skills learning faster downstream tasks. approach brings together strengths intrinsic motivation hierarchical methods: learning useful skill guided single proxy reward, design requires minimal domain knowledge downstream tasks. high-level policy trained top skills, providing significant improvement exploration allowing tackle sparse rewards downstream tasks. efficiently pre-train large span skills, use stochastic neural networks combined information-theoretic regularizer. experiments show combination effective learning wide span interpretable skills sample-efficient way, significantly boost learning performance uniformly across wide range downstream tasks.",4 "successive nonnegative projection algorithm robust nonnegative blind source separation. 
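The active-learning loop from the ner abstract above can be sketched generically with uncertainty sampling; the logistic-regression learner and the query rule here are standard placeholders, not the paper's cnn-cnn-lstm model:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

def train_logreg(X, y, lr=0.5, n_iter=500):
    # Minimal logistic regression via gradient descent; y in {0, 1}.
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        w -= lr * X.T @ (sigmoid(X @ w) - y) / len(y)
    return w

def active_learn(X_pool, y_pool, n_seed=4, n_queries=10, seed=0):
    """Uncertainty sampling: start from a few labeled seeds, then
    repeatedly query the label of the most uncertain pool point
    (predicted probability closest to 0.5) and retrain."""
    rng = np.random.default_rng(seed)
    labeled = list(rng.choice(len(X_pool), n_seed, replace=False))
    for _ in range(n_queries):
        w = train_logreg(X_pool[labeled], y_pool[labeled])
        p = sigmoid(X_pool @ w)
        candidates = [i for i in range(len(X_pool)) if i not in labeled]
        labeled.append(min(candidates, key=lambda i: abs(p[i] - 0.5)))
    return train_logreg(X_pool[labeled], y_pool[labeled])
```

On well-separated blobs, a handful of queried labels already yields a near-perfect classifier over the whole pool.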
paper, propose new fast robust recursive algorithm near-separable nonnegative matrix factorization, particular nonnegative blind source separation problem. algorithm, refer successive nonnegative projection algorithm (snpa), closely related popular successive projection algorithm (spa), takes advantage nonnegativity constraint decomposition. prove snpa robust spa applied broader class nonnegative matrices. illustrated synthetic data sets, real-world hyperspectral image.",19 "binary schema computational algorithms process vowel-based euphonic conjunctions word searches. comprehensively searching words sanskrit e-text non-trivial problem words could change forms different contexts. one context sandhi euphonic conjunctions, cause word change owing presence adjacent letters words. change wrought possible conjunctions significant sanskrit simple search word given form alone significantly reduce success level search. work presents representational schema represents letters binary format reduces paninian rules euphonic conjunctions simple bit set-unset operations. work presents efficient algorithm process vowel-based sandhis using schema. presents another algorithm uses sandhi processor generate possible transformed word forms given word use comprehensive word search.",4 "robust dictionary based data representation. robustness noise outliers important issue linear representation real applications. focus problem samples grossly corrupted, also 'sample specific' corruptions problem. reasonable assumption corrupted samples cannot represented dictionary clean samples well represented. assumption enforced paper investigating coefficients corrupted samples. concretely, require coefficients corrupted samples zero. way, representation quality clean data assured without effect corrupted data. 
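The successive projection algorithm (spa), the baseline that the snpa abstract above builds on, admits a compact sketch under the noiseless separability assumption:

```python
import numpy as np

def spa(M, r):
    """Successive Projection Algorithm: under the near-separability
    assumption M = M[:, K] H, greedily pick the column with the largest
    residual norm, then project all columns onto its orthogonal complement."""
    R = M.astype(float).copy()
    indices = []
    for _ in range(r):
        j = int(np.argmax((R ** 2).sum(axis=0)))
        indices.append(j)
        u = R[:, j] / np.linalg.norm(R[:, j])
        R -= np.outer(u, u @ R)  # orthogonal projection step
    return indices
```

On a synthetic separable matrix whose remaining columns are strict convex mixtures, spa recovers the pure columns.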
last, robust dictionary based data representation approach sparse representation version proposed, directive significance future applications.",4 "echo state queueing network: new reservoir computing learning tool. last decade, new computational paradigm introduced field machine learning, name reservoir computing (rc). rc models neural networks recurrent part (the reservoir) participate learning process, rest system recurrence (no neural circuit) occurs. approach grown rapidly due success solving learning tasks computational applications. success also observed another recently proposed neural network designed using queueing theory, random neural network (randnn). approaches good properties identified drawbacks. paper, propose new rc model called echo state queueing network (esqn), use ideas coming randnns design reservoir. esqns consist esns reservoir new dynamics inspired recurrent randnns. paper positions esqns global machine learning area, provides examples use performances. show largely used benchmarks esqns accurate tools, illustrate compare standard esns.",4 "multi-task averaging. present multi-task learning approach jointly estimate means multiple independent data sets. proposed multi-task averaging (mta) algorithm results convex combination single-task maximum likelihood estimates. derive optimal minimum risk estimator minimax estimator, show estimators efficiently estimated. simulations real data experiments demonstrate mta estimators often outperform single-task james-stein estimators.",19 "panoptic studio: massively multiview system social interaction capture. present approach capture 3d motion group people engaged social interaction. core challenges capturing social interactions are: (1) occlusion functional frequent; (2) subtle motion needs measured space large enough host social group; (3) human appearance configuration variation immense; (4) attaching markers body may prime nature interactions. 
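The convex-combination estimator in the multi-task-averaging abstract above can be sketched with a fixed shrinkage weight toward the pooled mean; the constant `gamma` is an illustrative assumption, not the paper's minimum-risk weighting:

```python
import numpy as np

def multi_task_average(samples, gamma=0.5):
    """Estimate each task's mean as a convex combination of its own
    sample mean and the pooled mean across tasks. `gamma` controls the
    amount of sharing (gamma=0 recovers the single-task estimates)."""
    task_means = np.array([np.mean(s) for s in samples])
    pooled = task_means.mean()
    return (1 - gamma) * task_means + gamma * pooled
```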
panoptic studio system organized around thesis social interactions measured integration perceptual analyses large variety view points. present modularized system designed around principle, consisting integrated structural, hardware, software innovations. system takes, input, 480 synchronized video streams multiple people engaged social activities, produces, output, labeled time-varying 3d structure anatomical landmarks individuals space. algorithm designed fuse ""weak"" perceptual processes large number views progressively generating skeletal proposals low-level appearance cues, framework temporal refinement also presented associating body parts reconstructed dense 3d trajectory stream. system method first reconstructing full body motion five people engaged social interactions without using markers. also empirically demonstrate impact number views achieving goal.",4 "estimating individual treatment effect observational data using random forest methods. estimation individual treatment effect observational data complicated due challenges confounding selection bias. useful inferential framework address counterfactual (potential outcomes) model takes hypothetical stance asking individual received treatments. making use random forests (rf) within counterfactual framework estimate individual treatment effects directly modeling response. find accurate estimation individual treatment effects possible even complex heterogeneous settings type rf approach plays important role accuracy. methods designed adaptive confounding, used parallel out-of-sample estimation, best. one method found especially promising counterfactual synthetic forests. illustrate new methodology applying large comparative effectiveness trial, project aware, order explore role drug use plays sexual risk. analysis reveals important connections risky behavior, drug usage, sexual risk.",19 "direct uncertainty estimation reinforcement learning. 
optimal probabilistic approach reinforcement learning computationally infeasible. simplification consisting neglecting difference true environment model estimated using limited number observations causes exploration vs exploitation problem. uncertainty expressed terms probability distribution space environment models, uncertainty propagated action-value function via bellman iterations, computationally insufficiently efficient though. consider possibility directly measuring uncertainty action-value function, analyze sufficiency facilitated approach.",4 "performance localisation. performance becomes issue particularly execution cost hinders functionality program. typically profiler used find program code execution represents large portion overall execution cost program. pinpointing performance issue exists provides starting point tracing cause back program. profiling shows performance issue manifests, use mutation analysis show performance improvement likely exist. find mutation analysis indicate locations within program highly impactful overall execution cost program yet executed relatively infrequently. better locating potential performance improvements programs hope make performance improvement amenable automation.",4 "linear-time algorithm bayesian image denoising based gaussian markov random field. paper, consider bayesian image denoising based gaussian markov random field (gmrf) model, propose new algorithm. method solve bayesian image denoising problems, including hyperparameter estimation, $o(n)$-time, $n$ number pixels given image. perspective order computational time, state-of-the-art algorithm present problem setting. moreover, results numerical experiments show method fact effective practice.",19 "approach reducing annotation costs bionlp. broad range bionlp tasks active learning (al) significantly reduce annotation costs specific al algorithm developed particularly effective reducing annotation costs tasks. 
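The gmrf posterior-mean computation in the denoising abstract above has a classical fast realization under periodic boundary assumptions: solve independently per frequency in the Fourier domain. This O(n log n) sketch is a related standard trick, not the paper's O(n) algorithm:

```python
import numpy as np

def gmrf_denoise(y, lam=5.0):
    """Posterior mean under a circulant GMRF prior:
    x* = argmin ||x - y||^2 + lam * x^T L x, solved per frequency as
    X(w) = Y(w) / (1 + lam * l(w)), where l is the Laplacian spectrum."""
    h, w = y.shape
    # Eigenvalues of the periodic 2-D graph Laplacian (4-neighbour grid).
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    lap = 4 - 2 * np.cos(2 * np.pi * fy) - 2 * np.cos(2 * np.pi * fx)
    X = np.fft.fft2(y) / (1 + lam * lap)
    return np.real(np.fft.ifft2(X))
```

Low frequencies pass nearly untouched while high-frequency noise is attenuated, so a smooth signal corrupted by white noise comes out closer to the truth.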
previously developed al algorithm called closestinitpa works best tasks following characteristics: redundancy training material, burdensome annotation costs, support vector machines (svms) work well task, imbalanced datasets (i.e. set binary classification problem, one class substantially rarer other). many bionlp tasks characteristics thus al algorithm natural approach apply bionlp tasks.",4 "context-dependent fine-grained entity type tagging. entity type tagging task assigning category labels mention entity document. standard systems focus small set types, recent work (ling weld, 2012) suggests using large fine-grained label set lead dramatic improvements downstream tasks. absence labeled training data, existing fine-grained tagging systems obtain examples automatically, using resolved entities types extracted knowledge base. however, since appropriate type often depends context (e.g. washington could tagged either city government), procedure result spurious labels, leading poorer generalization. propose task context-dependent fine type tagging, set acceptable labels mention restricted deducible local context (e.g. sentence document). introduce new resources task: 12,017 mentions annotated context-dependent fine types, provide baseline experimental results data.",4 "camera pose filtering local regression geodesics riemannian manifold dual quaternions. time-varying, smooth trajectory estimation great interest vision community accurate well behaving 3d systems. paper, propose novel principal component local regression filter acting directly riemannian manifold unit dual quaternions $\mathbb{d} \mathbb{h}_1$. use numerically stable lie algebra dual quaternions together $\exp$ $\log$ operators locally linearize 6d pose space. 
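a hedged sketch of the log/exp linearization just described, written for a generic pose group (the paper works on unit dual quaternions; the symbols below are illustrative): poses $q_i$ near a reference pose $\bar{q}$ are mapped to the tangent space, filtered linearly there, and mapped back:

```latex
x_i = \log\!\left(\bar{q}^{-1} q_i\right) \in \mathbb{R}^6, \qquad
\hat{x}_t = \sum_i w_i(t)\, x_i \ \ \text{(any linear filter / local regression)}, \qquad
\hat{q}_t = \bar{q}\, \exp\!\left(\hat{x}_t\right)
```

the linear step is valid only locally around $\bar{q}$, which is why such filters re-linearize as the trajectory moves.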
unlike state art path smoothing methods either operate $so\left(3\right)$ rotation matrices hypersphere $\mathbb{h}_1$ quaternions, treat orientation translation jointly dual quaternion quadric 7-dimensional real projective space $\mathbb{r}\mathbb{p}^7$. provide outlier-robust irls algorithm generic pose filtering exploiting manifold structure. besides theoretical analysis, experiments synthetic real data show practical advantages manifold aware filtering pose tracking smoothing.",4 "empirical model acknowledgment spoken-language systems. refine extend prior views description, purposes, contexts-of-use acknowledgment acts empirical examination use acknowledgments task-based conversation. distinguish three broad classes acknowledgments (other-->ackn, self-->other-->ackn, self+ackn) present catalogue 13 patterns within classes account specific uses acknowledgment corpus.",2 "generative adversarial nets multiple text corpora. generative adversarial nets (gans) successfully applied artificial generation image data. terms text data, much done artificial generation natural language single corpus. consider multiple text corpora input data, two applications gans: (1) creation consistent cross-corpus word embeddings given different word embeddings per corpus; (2) generation robust bag-of-words document embeddings corpora. demonstrate gan models real-world text data sets different corpora, show embeddings models lead improvements supervised learning problems.",4 plummer autoencoders. estimating true density high-dimensional feature spaces well-known problem machine learning. work shows possible formulate optimization problem minimization use representational power neural networks learn complex densities. theoretical bound estimation error given dealing finite number samples. 
proposed theory corroborated extensive experiments different datasets compared several existing approaches families generative adversarial networks autoencoder-based models.,4 "regularization approach blind deblurring denoising qr barcodes. qr bar codes prototypical images part image priori known (required patterns). open source bar code readers, zbar, readily available. exploit facts provide assess purely regularization-based methods blind deblurring qr bar codes presence noise.",4 "multi-scale mining fmri data hierarchical structured sparsity. inverse inference, ""brain reading"", recent paradigm analyzing functional magnetic resonance imaging (fmri) data, based pattern recognition statistical learning. predicting cognitive variables related brain activation maps, approach aims decoding brain activity. inverse inference takes account multivariate information voxels currently way assess precisely cognitive information encoded activity neural populations within whole brain. however, relies prediction function plagued curse dimensionality, since far features samples, i.e., voxels fmri volumes. address problem, different methods proposed, as, among others, univariate feature selection, feature agglomeration regularization techniques. paper, consider sparse hierarchical structured regularization. specifically, penalization use constructed tree obtained spatially-constrained agglomerative clustering. approach encodes spatial structure data different scales regularization, makes overall prediction procedure robust inter-subject variability. regularization used induces selection spatially coherent predictive brain regions simultaneously different scales. test algorithm real data acquired study mental representation objects, show proposed algorithm delineates meaningful brain regions yields well better prediction accuracy reference methods.",19 "interactive multiclass segmentation using superpixel classification. paper addresses problem interactive multiclass segmentation. 
propose fast efficient new interactive segmentation method called superpixel classification-based interactive segmentation (scis). strokes drawn human user image, method extracts relevant semantic objects. get fast calculation accurate segmentation, scis uses superpixel over-segmentation support vector machine classification. paper, demonstrate scis significantly outperforms competing algorithms evaluating performances reference benchmarks mcguinness santner.",4 "deep matching prior network: toward tighter multi-oriented text detection. detecting incidental scene text challenging task multi-orientation, perspective distortion, variation text size, color scale. retrospective research focused using rectangular bounding box horizontal sliding window localize text, may result redundant background noise, unnecessary overlap even information loss. address issues, propose new convolutional neural networks (cnns) based method, named deep matching prior network (dmpnet), detect text tighter quadrangle. first, use quadrilateral sliding windows several specific intermediate convolutional layers roughly recall text higher overlapping area shared monte-carlo method proposed fast accurate computing polygonal areas. that, designed sequential protocol relative regression exactly predict text compact quadrangle. moreover, auxiliary smooth ln loss also proposed regressing position text, better overall performance l2 loss smooth l1 loss terms robustness stability. effectiveness approach evaluated public word-level, multi-oriented scene text database, icdar 2015 robust reading competition challenge 4 ""incidental scene text localization"". performance method evaluated using f-measure found 70.64%, outperforming existing state-of-the-art method f-measure 63.76%.",4 "learning document image binarization data. paper present fully trainable binarization solution degraded document images. 
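the dmpnet abstract above mentions a monte-carlo method for fast computation of polygonal areas. a minimal toy sketch of that general idea, not the paper's algorithm: rejection sampling over the bounding box with a ray-casting inside test (all names are illustrative):

```python
import random

def point_in_polygon(pt, poly):
    # ray-casting test: count crossings of a horizontal ray going right from pt
    x, y = pt
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):                      # edge straddles the ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > x:
                inside = not inside
    return inside

def mc_polygon_area(poly, n_samples=20000, seed=0):
    # sample uniformly in the bounding box; area ~= box_area * hit fraction
    rng = random.Random(seed)
    xs = [p[0] for p in poly]
    ys = [p[1] for p in poly]
    x0, x1, y0, y1 = min(xs), max(xs), min(ys), max(ys)
    box_area = (x1 - x0) * (y1 - y0)
    hits = sum(
        point_in_polygon((rng.uniform(x0, x1), rng.uniform(y0, y1)), poly)
        for _ in range(n_samples)
    )
    return box_area * hits / n_samples
```

the estimate converges at the usual O(1/sqrt(n)) monte-carlo rate for convex and non-convex simple polygons alike.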
unlike previous attempts often used simple features series pre- post-processing, solution encodes heuristics whether pixel foreground text high-dimensional feature vector learns complicated decision function. particular, prepare features three types: 1) existing features binarization intensity [1], contrast [2], [3], laplacian [4], [5]; 2) reformulated features existing binarization decision functions [6] [7]; 3) newly developed features, namely logarithm intensity percentile (lip) relative darkness index (rdi). initial experimental results show using selected samples (about 1.5% available training data), achieve binarization performance comparable fine-tuned (typically hand), state-of-the-art methods. additionally, trained document binarization classifier shows good generalization capabilities out-of-domain data.",4 "extracting bilingual persian italian lexicon comparable corpora using different types seed dictionaries. bilingual dictionaries important various fields natural language processing. recent years, research extracting new bilingual lexicons non-parallel (comparable) corpora proposed. almost use small existing dictionary resource make initial list called ""seed dictionary"". paper discuss use different types dictionaries initial starting list creating bilingual persian-italian lexicon comparable corpus. experiments apply state-of-the-art techniques three different seed dictionaries; existing dictionary, dictionary created pivot-based schema, dictionary extracted small persian-italian parallel text. interesting challenge approach find way combine different dictionaries together order produce better accurate lexicon. order combine seed dictionaries, propose two different combination models examine effect novel combination models various comparable corpora differing degrees comparability. conclude proposal new weighting system improve extracted lexicon. 
experimental results produced implementation show efficiency proposed models.",4 "evidential reasoning parallel hierarchical vision programs. paper presents efficient adaptation application dempster-shafer theory evidence, one used effectively massively parallel hierarchical system visual pattern perception. describes techniques used, shows extended example serve improve system's performance applies multiple-level set processes.",4 "qualitative shape representation based qualitative relative direction distance calculus eopram. document serves brief technical report, detailing processes used represent reconstruct simplified polygons using qualitative spatial descriptions, defined eopram qualitative spatial calculus.",4 "go deep wide learning?. achieve acceptable performance ai tasks, one either use sophisticated feature extraction methods first layer two-layered supervised learning model, learn features directly using deep (multi-layered) model. first approach problem-specific, second approach computational overheads learning multiple layers fine-tuning model. paper, propose approach called wide learning based arc-cosine kernels, learns single layer infinite width. propose exact inexact learning strategies wide learning show wide learning single layer outperforms single layer well deep architectures finite width benchmark datasets.",4 "fastmask: segment multi-scale object candidates one shot. objects appear scale differently natural images. fact requires methods dealing object-centric tasks (e.g. object proposal) robust performance variances object scales. paper, present novel segment proposal framework, namely fastmask, takes advantage hierarchical features deep convolutional neural networks segment multi-scale objects one shot. innovatively, adapt segment proposal network three different functional components (body, neck head). propose weight-shared residual neck module well scale-tolerant attentional head module efficient one-shot inference. 
ms coco benchmark, proposed fastmask outperforms state-of-the-art segment proposal methods average recall 2~5 times faster. moreover, slight trade-off accuracy, fastmask segment objects near real time (~13 fps) 800*600 resolution images, demonstrating potential practical applications. implementation available https://github.com/voidrank/fastmask.",4 "deep learning reverse photon migration diffuse optical tomography. artificial intelligence (ai) learn complicated non-linear physics? propose novel deep learning approach learns non-linear photon scattering physics obtains accurate 3d distribution optical anomalies. contrast traditional black-box deep learning approaches inverse problems, deep network learns invert lippmann-schwinger integral equation describes essential physics photon migration diffuse near-infrared (nir) photons turbid media. example clinical relevance, applied method prototype diffuse optical tomography (dot). show deep neural network, trained simulation data, accurately recover location anomalies within biomimetic phantoms live animals without use exogenous contrast agent.",4 "bridging gap reinforcement learning knowledge representation: logical off- on-policy framework. knowledge representation important issue reinforcement learning. paper, bridge gap reinforcement learning knowledge representation, providing rich knowledge representation framework, based normal logic programs answer set semantics, capable solving model-free reinforcement learning problems complex domains exploits domain-specific knowledge. prove correctness approach. show complexity finding offline online policy model-free reinforcement learning problem approach np-complete. moreover, show model-free reinforcement learning problem mdp environment encoded sat problem. importance model-free reinforcement",4 "dirichlet process mixed random measures: nonparametric topic model labeled data. describe nonparametric topic model labeled data. 
model uses mixture random measures (mrm) base distribution dirichlet process (dp) hdp framework, call dp-mrm. model labeled data, define dp distributed random measure label, resulting model generates unbounded number topics label. apply dp-mrm single-labeled multi-labeled corpora documents compare performance label prediction medlda, lda-svm, labeled-lda. enhance model incorporating ddcrp modeling multi-labeled images image segmentation object labeling, comparing performance ncuts rddcrp.",4 "robust distributed online prediction. standard model online prediction deals serial processing inputs single processor. however, large-scale online prediction problems, inputs arrive high rate, increasingly common necessity distribute computation across several processors. non-trivial challenge design distributed algorithms online prediction, maintain good regret guarantees. \cite{dmb}, presented dmb algorithm, generic framework convert serial gradient-based online prediction algorithm distributed algorithm. moreover, regret guarantee asymptotically optimal smooth convex loss functions stochastic inputs. flip side, fragile many types failures common distributed environments. companion paper, present variants dmb algorithm, resilient many types network failures, tolerant varying performance computing nodes.",4 "inference minimizing size, divergence, sum. speed marginal inference ignoring factors significantly contribute overall accuracy. order pick suitable subset factors ignore, propose three schemes: minimizing number model factors bound kl divergence pruned full models; minimizing kl divergence bound factor count; minimizing weighted sum kl divergence factor count. three problems solved using approximation kl divergence calculated terms marginals computed simple seed graph. 
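in a fully factorized toy case, the "minimize size subject to a divergence bound" scheme above reduces to dropping the factors whose removal costs the least kl. a hedged stdlib-only sketch (a pruned factor's marginal is replaced by a uniform 0.5; names are illustrative, and the paper's seed-graph approximation is not reproduced):

```python
import math

def kl_bernoulli(p, q):
    # KL(Ber(p) || Ber(q)); assumes p, q strictly inside (0, 1)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def prune_factors(marginals, kl_budget):
    """greedily drop factors whose removal (marginal -> uniform 0.5) costs
    the least KL, while the summed KL stays under kl_budget; returns the
    indices of the factors that are kept."""
    costs = sorted((kl_bernoulli(p, 0.5), i) for i, p in enumerate(marginals))
    dropped, spent = set(), 0.0
    for cost, i in costs:
        if spent + cost <= kl_budget:
            spent += cost
            dropped.add(i)
    return [i for i in range(len(marginals)) if i not in dropped]
```

in the factorized case the total divergence decomposes as a sum of per-factor terms, which is what makes this greedy ordering exact; general graphs need the seed-graph approximation the abstract describes.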
applied synthetic image denoising three different types nlp parsing models, technique performs marginal inference 11 times faster loopy bp, graph sizes reduced 98%, at comparable error marginals parsing accuracy. also show minimizing weighted sum divergence size substantially faster minimizing either objectives based approximation divergence presented here.",4 "refining source representations relation networks neural machine translation. although neural machine translation (nmt) encoder-decoder framework achieved great success recent times, still suffers drawbacks: rnns tend forget old information often useful encoder operates words without considering word relationship. solve problems, introduce relation networks (rn) nmt refine encoding representations source. method, rn first augments representation source word neighbors reasons possible pairwise relations them. source representations relations fed attention module decoder together, keeping main encoder-decoder architecture unchanged. experiments two chinese-to-english data sets different scales show method outperform competitive baselines significantly.",4 "narrative science systems: review. automatic narration events entities need hour, especially live reporting critical volume information narrated huge. paper discusses challenges context, along algorithms used build systems. systematic study, infer work done area related statistical data. also found subjective evaluation contribution experts also limited narration context.",4 "tensor regression networks various low-rank tensor approximations. tensor regression networks achieve high rate compression model parameters multilayer perceptrons (mlp) slight impact performances. tensor regression layer imposes low-rank constraints tensor regression layer replaces flattening operation traditional mlp. 
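a minimal numpy sketch of the low-rank tensor regression idea just described: a cp-factorized weight tensor replaces a flatten-plus-dense map, so the contraction never materializes the full weight (sizes and names are illustrative, not the paper's model):

```python
import numpy as np

rng = np.random.default_rng(0)
d1, d2, d3, rank = 8, 8, 8, 3          # illustrative sizes

# rank-R CP factors stand in for a dense (d1*d2*d3)-parameter weight tensor
A = rng.normal(size=(rank, d1))
B = rng.normal(size=(rank, d2))
C = rng.normal(size=(rank, d3))

def trl_forward(X):
    # scalar output <X, W> with W = sum_r a_r (x) b_r (x) c_r,
    # contracted directly so W is never built
    return np.einsum('ijk,ri,rj,rk->', X, A, B, C)

dense_params = d1 * d2 * d3              # 512 for flatten + dense
lowrank_params = rank * (d1 + d2 + d3)   # 72 for the factorized layer
```

here the factorized layer uses 72 instead of 512 parameters; the 0.018 compression rate quoted in the abstract refers to the paper's own architecture, not this toy.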
investigate tensor regression networks using various low-rank tensor approximations, aiming leverage multi-modal structure high dimensional data enforcing efficient low-rank constraints. provide theoretical analysis giving insights choice rank parameters. evaluated performance proposed model state-of-the-art deep convolutional models. cifar-10 dataset, achieved compression rate 0.018 sacrifice accuracy less 1%.",4 "detection tracking liquids fully convolutional networks. recent advances ai robotics claimed many incredible results deep learning, yet work date applied deep learning problem liquid perception reasoning. paper, apply fully-convolutional deep neural networks tasks detecting tracking liquids. evaluate three models: single-frame network, multi-frame network, lstm recurrent network. results show best liquid detection results achieved aggregating data multiple frames, contrast standard image segmentation. also show lstm network outperforms two tasks. suggests lstm-based neural networks potential key component enabling robots handle liquids using robust, closed-loop controllers.",4 "slugbot: application novel scalable open domain socialbot framework. paper introduce novel, open domain socialbot amazon alexa prize competition, aimed carrying friendly conversations users variety topics. present modular system, highlighting different data sources use human mind model data management. additionally build employ natural language understanding information retrieval tools apis expand knowledge bases. describe semistructured, scalable framework crafting topic-specific dialogue flows, give details dialogue management schemes scoring mechanisms. finally briefly evaluate performance system observe challenges open domain socialbot faces.",4 "deep representation learning part loss person re-identification. learning discriminative representations unseen person images critical person re-identification (reid). 
current approaches learn deep representations classification tasks, essentially minimize empirical classification risk training set. shown experiments, representations commonly focus several body parts discriminative training set, rather entire human body. inspired structural risk minimization principle svm, revise traditional deep representation learning procedure minimize empirical classification risk representation learning risk. representation learning risk evaluated proposed part loss, automatically generates several parts image, computes person classification loss part separately. compared traditional global classification loss, simultaneously considering multiple part loss enforces deep network focus entire human body learn discriminative representations different parts. experimental results three datasets, i.e., market1501, cuhk03, viper, show representation outperforms existing deep representations.",4 "bridge simulation metric estimation landmark manifolds. present inference algorithm connected monte carlo based estimation procedures metric estimation landmark configurations distributed according transition distribution riemannian brownian motion arising large deformation diffeomorphic metric mapping (lddmm) metric. distribution possesses properties similar regular euclidean normal distribution transition density governed high-dimensional pde closed-form solution nonlinear case. show density numerically approximated monte carlo sampling conditioned brownian bridges, use estimate parameters lddmm kernel thus metric structure maximum likelihood.",4 "deep ehr: survey recent advances deep learning techniques electronic health record (ehr) analysis. past decade seen explosion amount digital information stored electronic health records (ehr). primarily designed archiving patient clinical information administrative healthcare tasks, many researchers found secondary use records various clinical informatics tasks. 
period, machine learning community seen widespread advances deep learning techniques, also successfully applied vast amount ehr data. paper, review deep ehr systems, examining architectures, technical aspects, clinical applications. also identify shortcomings current techniques discuss avenues future research ehr-based deep learning.",4 "feature based approach video compression. high cost problem panoramic image stitching via image matching algorithm practical real-time performance. paper, take full advantage of harris corner invariant characterization method light intensity parallel meaning, translation rotation, made realtime panoramic image stitching algorithm. according basic characteristics performance fpga classical algorithm, several modules feature point extraction, matching description optimize feature-based logic. real-time optimization system achieve high precision match. new algorithm process image pixel domain obtained ccd camera xilinx spartan-6 hardware platform. image stitching algorithm, eventually form portable interface output high-definition content display. results showed that, proposed algorithm higher precision good real-time performance robustness.",4 solving goddard problem influence diagram. influence diagrams decision-theoretic extension probabilistic graphical models. paper show used solve goddard problem. present results numerical experiments problem compare solutions provided influence diagrams optimal solution.,4 "study cuckoo optimization algorithm production planning problem. constrained nonlinear programming problems hard problems, one widely used common problems production planning problem optimize. study, one mathematical models production planning survey problem solved cuckoo algorithm. cuckoo algorithm efficient method solve continuous non linear problem. moreover, mentioned models production planning solved genetic algorithm lingo software results compared. 
cuckoo algorithm suitable choice optimization convergence solution",12 "urban legends go viral?. urban legends genre modern folklore, consisting stories rare exceptional events, plausible enough believed, tend propagate inexorably across communities. view, urban legends represent form ""sticky"" deceptive text, marked tension credible incredible. credible like news article incredible like fairy tale go viral. particular focus idea urban legends mimic details news (who, where, when) credible, emotional readable like fairy tale catchy memorable. using nlp tools provide quantitative analysis prototypical characteristics. also lay machine learning experiments showing possible recognize urban legend using simple features.",4 "predicting privileged information height estimation. paper, propose novel regression-based method employing privileged information estimate height using human metrology. actual values anthropometric measurements difficult estimate accurately using state-of-the-art computer vision algorithms. hence, use ratios anthropometric measurements features. since many anthropometric measurements available test time real-life scenarios, employ learning using privileged information (lupi) framework regression setup. instead using lupi paradigm regression original form (i.e., \epsilon-svr+), train regression models predict privileged information test time. predictions used, along observable features, perform height estimation. height estimated, mapping classes performed. demonstrate proposed approach estimate height better faster \epsilon-svr+ algorithm report results different genders quartiles humans.",4 fitness-based adaptive control parameters genetic programming: adaptive value setting mutation rate flood mechanisms. paper concerns applications genetic algorithms genetic programming tasks difficult find representation map highly complex discontinuous fitness landscape. cases standard algorithm prone getting trapped local extremes. 
paper proposes several adaptive mechanisms useful preventing search getting trapped.,4 "multi-view metric learning multi-view video summarization. traditional methods video summarization designed generate summaries single-view video records; thus cannot fully exploit redundancy multi-view video records. paper, present multi-view metric learning framework multi-view video summarization combines advantages maximum margin clustering disagreement minimization criterion. learning framework thus ability find metric best separates data, meanwhile force learned metric maintain original intrinsic information data points, example geometric information. facilitated framework, systematic solution multi-view video summarization problem developed. best knowledge, first time address multi-view video summarization viewpoint metric learning. effectiveness proposed method demonstrated experiments.",4 "transductive zero-shot action recognition word-vector embedding. number categories action recognition growing rapidly become increasingly hard label sufficient training data learning conventional models categories. instead collecting ever data labelling exhaustively categories, attractive alternative approach zero-shot learning"" (zsl). end, study construct mapping visual features semantic descriptor action category, allowing new categories recognised absence visual training data. existing zsl studies focus primarily still images, attribute-based semantic representations. work, explore word-vectors shared semantic space embed videos category labels zsl action recognition. challenging problem existing zsl still images and/or attributes, mapping video spacetime features actions semantic space complex harder learn purpose generalising cross-category domain shift. solve generalisation problem zsl action recognition, investigate series synergistic strategies improve upon standard zsl pipeline. strategies transductive nature means access testing data training phase.",4 "variational bi-lstms. 
recurrent neural networks like long short-term memory (lstm) important architectures sequential prediction tasks. lstms (and rnns general) model sequences along forward time direction. bidirectional lstms (bi-lstms) hand model sequences along forward backward directions generally known perform better tasks capture richer representation data. training bi-lstms, forward backward paths learned independently. propose variant bi-lstm architecture, call variational bi-lstm, creates channel two paths (during training, may omitted inference); thus optimizing two paths jointly. arrive joint objective model minimizing variational lower bound joint likelihood data sequence. model acts regularizer encourages two networks inform making respective predictions using distinct information. perform ablation studies better understand different components model evaluate method various benchmarks, showing state-of-the-art performance.",19 "rand-walk: latent variable model approach word embeddings. semantic word embeddings represent meaning word via vector, created diverse methods. many use nonlinear operations co-occurrence statistics, hand-tuned hyperparameters reweighting methods. paper proposes new generative model, dynamic version log-linear topic model of~\citet{mnih2007three}. methodological novelty use prior compute closed form expressions word statistics. provides theoretical justification nonlinear models like pmi, word2vec, glove, well hyperparameter choices. also helps explain low-dimensional semantic embeddings contain linear algebraic structure allows solution word analogies, shown by~\citet{mikolov2013efficient} many subsequent papers. experimental support provided generative model assumptions, important latent word vectors fairly uniformly dispersed space.",4 "hypergraph-partitioned vertex programming approach large-scale consensus optimization. 
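consensus optimization with admm, as named in the title just above, is commonly formulated with per-node local solves plus an averaging step. a minimal numpy sketch of that standard consensus-admm formulation for split least squares (illustrative only, not the paper's graphlab implementation):

```python
import numpy as np

def consensus_admm(As, bs, rho=1.0, iters=300):
    """ADMM for min_x sum_i 0.5*||A_i x - b_i||^2, each term at one 'node':
    local x-solves, a global averaging (consensus) z-step, dual updates."""
    n = As[0].shape[1]
    m = len(As)
    xs = [np.zeros(n) for _ in range(m)]
    us = [np.zeros(n) for _ in range(m)]
    z = np.zeros(n)
    # precompute each node's solve; inv() is fine for a small sketch
    lhs = [np.linalg.inv(A.T @ A + rho * np.eye(n)) for A in As]
    rhs0 = [A.T @ b for A, b in zip(As, bs)]
    for _ in range(iters):
        xs = [L @ (r + rho * (z - u)) for L, r, u in zip(lhs, rhs0, us)]
        z = sum(x + u for x, u in zip(xs, us)) / m     # consensus step
        us = [u + x - z for u, x in zip(us, xs)]       # dual update
    return z
```

the z-step is the only global communication, which is what makes the scheme map naturally onto vertex-programming frameworks.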
modern data science problems, techniques extracting value big data require performing large-scale optimization heterogeneous, irregularly structured data. much data best represented multi-relational graphs, making vertex programming abstractions pregel graphlab ideal fits modern large-scale data analysis. paper, describe vertex-programming implementation popular consensus optimization technique known alternating direction method multipliers (admm). admm consensus optimization allows elegant solution complex objectives inference rich probabilistic models. also introduce novel hypergraph partitioning technique improves state-of-the-art partitioning techniques vertex programming significantly reduces communication cost reducing number replicated nodes order magnitude. implemented algorithm graphlab measure scaling performance variety realistic bipartite graph distributions large synthetic voter-opinion analysis application. experiments, able achieve 50% improvement runtime current state-of-the-art graphlab partitioning scheme.",4 "sparsey: event recognition via deep hierarchical sparse distributed codes. visual cortex's hierarchical, multi-level organization captured many biologically inspired computational vision models, general idea progressively larger scale, complex spatiotemporal features represented progressively higher areas. however, earlier models use localist representations (codes) representational field, equate cortical macrocolumn (mac), level. localism, represented feature/event (item) coded single unit. model, sparsey, also hierarchical crucially, uses sparse distributed coding (sdc) every mac levels. sdc, represented item coded small subset mac's units. sdcs different items overlap size overlap items represent similarity. difference localism sdc crucial sdc allows two essential operations associative memory, storing new item retrieving best-matching stored item, done fixed time life model. 
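the sdc store/retrieve operations just described can be illustrated with a toy memory in which each code is a small subset of a mac's units and overlap size acts as similarity. this sketch scans all stored items, so it does not reproduce sparsey's fixed-time algorithm, only the coding idea (names are illustrative):

```python
class SDCMemory:
    """toy sparse-distributed-code memory: items are small subsets of a
    mac's units; code overlap size serves as the similarity measure."""

    def __init__(self):
        self.items = []

    def store(self, code):
        # a code = the set of active unit indices for one item
        self.items.append(frozenset(code))

    def retrieve(self, probe):
        # best-matching stored item = maximal code overlap with the probe
        probe = frozenset(probe)
        return max(self.items, key=lambda c: len(c & probe))
```

because similar items share active units, a partially corrupted probe still overlaps its original code more than any other, which is the associative-memory property the abstract highlights.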
since model's core algorithm, storage retrieval (inference), makes single pass macs time step, overall model's storage/retrieval operation also fixed-time, criterion consider essential scalability huge datasets. 2010 paper described nonhierarchical version model context purely spatial pattern processing. here, elaborate fully hierarchical model (arbitrary numbers levels macs per level), describing novel model principles like progressive critical periods, dynamic modulation principal cells' activation functions based mac-level familiarity measure, representation multiple simultaneously active hypotheses, novel method time warp invariant recognition, report results showing learning/recognition spatiotemporal patterns.",16 "iot endpoint system-on-chip secure energy-efficient near-sensor analytics. near-sensor data analytics promising direction iot endpoints, minimizes energy spent communication reduces network load - also poses security concerns, valuable data stored sent network various stages analytics pipeline. using encryption protect sensitive data boundary on-chip analytics engine way address data security issues. cope combined workload analytics encryption tight power envelope, propose fulmine, system-on-chip based tightly-coupled multi-core cluster augmented specialized blocks compute-intensive data processing encryption functions, supporting software programmability regular computing tasks. fulmine soc, fabricated 65nm technology, consumes less 20mw average 0.8v achieving efficiency 70pj/b encryption, 50pj/px convolution, 25mips/mw software. 
as a strong argument for a real-life flexible application platform, we show experimental results for three secure analytics use cases: secure autonomous aerial surveillance with a state-of-the-art deep cnn consuming 3.16pj per equivalent risc op; local cnn-based face detection with secured remote recognition at 5.74pj/op; and seizure detection with encrypted data collection from eeg within 12.7pj/op.",4 "parameter estimation in softmax decision-making models with linear objective functions. with an eye towards human-centered automation, we contribute to the development of a systematic means to infer features of human decision-making from behavioral data. motivated by the common use of softmax selection in models of human decision-making, we study the maximum likelihood parameter estimation problem for softmax decision-making models with linear objective functions. we present conditions under which the likelihood function is convex. these allow us to provide sufficient conditions for convergence of the resulting maximum likelihood estimator and to construct its asymptotic distribution. in the case of models with nonlinear objective functions, we show how the estimator can be applied by linearizing about a nominal parameter value. we apply the estimator to fit the stochastic ucl (upper credible limit) model of human decision-making to human subject data. we show statistically significant differences in behavior across related, but distinct, tasks.",12 "knowledge and common knowledge in a distributed environment. reasoning about knowledge seems to play a fundamental role in distributed systems. indeed, such reasoning is a central part of the informal intuitive arguments used in the design of distributed protocols. communication in a distributed system can be viewed as the act of transforming the system's state of knowledge. this paper presents a general framework for formalizing reasoning about knowledge in distributed systems. we argue that states of knowledge of groups of processors are useful concepts for the design and analysis of distributed protocols. in particular, distributed knowledge corresponds to knowledge that is ``distributed'' among the members of a group, while common knowledge corresponds to a fact being ``publicly known''. the relationship between common knowledge and a variety of desirable actions in a distributed system is illustrated.
furthermore, it is shown that, formally speaking, in practical systems common knowledge cannot be attained. a number of weaker variants of common knowledge that are attainable in many cases of interest are introduced and investigated.",4 "multimodal convolutional neural networks for matching image and sentence. in this paper, we propose multimodal convolutional neural networks (m-cnns) for matching image and sentence. the m-cnn provides an end-to-end framework with convolutional architectures to exploit image representation, word composition, and the matching relations between the two modalities. specifically, it consists of one image cnn encoding the image content, and one matching cnn learning the joint representation of image and sentence. the matching cnn composes words into different semantic fragments and learns the inter-modal relations between the image and the composed fragments at different levels, and thus fully exploits the matching relations between image and sentence. experimental results on benchmark databases of bidirectional image and sentence retrieval demonstrate that the proposed m-cnns can effectively capture the information necessary for image and sentence matching. specifically, the proposed m-cnns for bidirectional image and sentence retrieval on the flickr30k and microsoft coco databases achieve state-of-the-art performances.",4 "graph-based denoising for time-varying point clouds. noisy 3d point clouds arise in many applications. they may be due to errors when constructing a 3d model from images, or simply to imprecise depth sensors. point clouds can be given a geometrical structure using graphs created from the similarity information between points. this paper introduces a technique that uses this graph structure and convex optimization methods to denoise 3d point clouds. a short discussion presents how these methods naturally generalize to time-varying inputs such as 3d point cloud time series.",4 "continuation semantics for multi-quantifier sentences: operation-based approaches. classical scope-assignment strategies for multi-quantifier sentences involve quantifier phrase (qp)-movement. more recent continuation-based approaches provide a compelling alternative, for they interpret qp's in situ - without resorting to logical forms or structures beyond the overt syntax.
the continuation-based strategies can be divided into two groups: those that locate the source of scope-ambiguity in the rules of semantic composition, and those that attribute it to the lexical entries for the quantifier words. in this paper, we focus on the former, operation-based approaches and the nature of the semantic operations involved. specifically, we discuss three possible operation-based strategies for multi-quantifier sentences, together with their relative merits and costs.",12 "short-term memory through persistent activity: evolution of self-stopping and self-sustaining activity in spiking neural networks. memories in the brain are separated into two categories: short-term and long-term memories. long-term memories remain for a lifetime, while short-term ones exist from a few milliseconds to a few minutes. within short-term memory studies, there is debate about which neural structure could implement it. indeed, the mechanisms responsible for long-term memories appear inadequate for the task. instead, it has been proposed that short-term memories could be sustained by the persistent activity of a group of neurons. in this work, we explore which topology could sustain short-term memories, not by designing a model from specific hypotheses, but through darwinian evolution, in order to obtain new insights into its implementation. we evolved 10 networks capable of retaining information for a fixed duration between 2 and 11s. our main finding is that the evolution naturally created two functional modules in the network: one that sustains the information, containing primarily excitatory neurons, and another, responsible for forgetting, composed mainly of inhibitory neurons. this demonstrates that the balance between inhibition and excitation plays an important role in cognition.",4 "algorithmic stability and hypothesis complexity. we introduce a notion of algorithmic stability of learning algorithms---that we term \emph{argument stability}---that captures the stability of the hypothesis output by the learning algorithm in the normed space of functions from which hypotheses are selected. the main result of the paper bounds the generalization error of any learning algorithm in terms of its argument stability. the bounds are based on martingale inequalities in the banach space to which the hypotheses belong. we apply the general bounds to bound the performance of some learning algorithms based on empirical risk minimization and stochastic gradient descent.",19 "a parameterized approach for personalized variable length summarization of soccer matches.
we present a parameterized approach to produce personalized variable length summaries of soccer matches. our approach is based on temporally segmenting the soccer video into 'plays', associating a user-specifiable 'utility' with each type of play, and using 'bin-packing' to select a subset of the plays that adds up to the desired length while maximizing the overall utility (volume in bin-packing terms). our approach systematically allows the user to override the default weights assigned to each type of play with individual preferences and thus see a highly personalized variable length summarization of soccer matches. we demonstrate our approach based on the output of an end-to-end pipeline that we are building to produce such summaries. though aspects of the overall end-to-end pipeline are human assisted at present, the results clearly show that the proposed approach is capable of producing semantically meaningful and compelling summaries. besides the obvious use of producing summaries of superior league matches for news broadcasts, we anticipate our work to promote greater awareness of local matches and junior leagues by producing consumable summaries of them.",4 "vegac: visual saliency-based age, gender, and facial expression classification using convolutional neural networks. this paper explores the use of visual saliency to classify age, gender and facial expression for facial images. for multi-task classification, we propose our method vegac, which is based on visual saliency. using a deep multi-level network [1] and an off-the-shelf face detector [2], our proposed method first detects the face in the test image and extracts the cnn predictions on the cropped face. the cnn of vegac was fine-tuned on our dataset collected from different benchmarks. our convolutional neural network (cnn) uses the vgg-16 architecture [3] and is pre-trained on imagenet for image classification. we demonstrate the usefulness of our method for age estimation, gender classification, and facial expression classification. we show that we obtain competitive results with our method on selected benchmarks. our models and code are publicly available.",4 "modified splice and its extension to non-stereo data for noise robust speech recognition. in this paper, a modification to the training process of the popular splice algorithm is proposed for noise robust speech recognition.
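The play-selection step in the soccer-summarization abstract above — pick a subset of plays that fits a desired summary length while maximizing total utility — is a 0/1 knapsack problem. A minimal dynamic-programming sketch (the play durations and utilities are made up for illustration, not from the paper):

```python
def select_plays(plays, budget):
    """0/1 knapsack DP: plays are (duration, utility) pairs; budget is
    the desired summary length. Returns indices of the chosen plays."""
    # best[t] = (max achievable utility within duration t, chosen indices)
    best = [(0, [])] * (budget + 1)
    for i, (dur, util) in enumerate(plays):
        # iterate durations downwards so each play is used at most once
        for t in range(budget, dur - 1, -1):
            cand = (best[t - dur][0] + util, best[t - dur][1] + [i])
            if cand[0] > best[t][0]:
                best[t] = cand
    return best[budget][1]

# (duration in seconds, user-assigned utility) for each detected play
plays = [(30, 5), (20, 8), (50, 9), (10, 4)]
chosen = select_plays(plays, budget=60)
assert chosen == [0, 1, 3]        # total duration 60s, total utility 17
```

Raising a play type's utility (e.g. all goals) steers the same machinery toward a personalized summary, which matches the user-override idea in the abstract.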
the modification is based on feature correlations, and enables this stereo-based algorithm to improve its performance in all noise conditions, especially in unseen cases. further, the modified framework is extended to work with non-stereo datasets, where clean and noisy training utterances, but not stereo counterparts, are required. finally, an mllr-based computationally efficient run-time noise adaptation method in the splice framework is proposed. the modified splice shows 8.6% absolute improvement over splice in test c of the aurora-2 database, and 2.93% overall. the non-stereo method shows 10.37% and 6.93% absolute improvements over the aurora-2 and aurora-4 baseline models respectively. the run-time adaptation shows 9.89% absolute improvement in the modified framework as compared to splice in test c, and 4.96% overall w.r.t. standard mllr adaptation on hmms.",4 "voice conversion from unaligned corpora using variational autoencoding wasserstein generative adversarial networks. building a voice conversion (vc) system from non-parallel speech corpora is challenging but highly valuable in real application scenarios. in most situations, the source and the target speakers do not repeat the same texts or may even speak different languages. in this case, one possible, although indirect, solution is to build a generative model for speech. generative models focus on explaining the observations with latent variables instead of learning a pairwise transformation function, thereby bypassing the requirement of speech frame alignment. in this paper, we propose a non-parallel vc framework with a variational autoencoding wasserstein generative adversarial network (vaw-gan) that explicitly considers a vc objective when building the speech model. experimental results corroborate the capability of our framework for building a vc system from unaligned data, and demonstrate improved conversion quality.",4 "optimal query complexity for reconstructing hypergraphs. in this paper we consider the problem of reconstructing a hidden weighted hypergraph of constant rank using additive queries. we prove the following: let $g$ be a weighted hidden hypergraph of constant rank with n vertices and $m$ hyperedges. for any $m$ there exists a non-adaptive algorithm that finds the edges of the graph and their weights using $$ o(\frac{m\log n}{\log m}) $$ additive queries. this solves the open problem in [s. choi, j. h.
kim. optimal query complexity bounds for finding graphs. {\em stoc}, 749--758,~2008]. when the weights of the hypergraph are integers that are less than $o(poly(n^d/m))$, where $d$ is the rank of the hypergraph (and therefore for unweighted hypergraphs), there exists a non-adaptive algorithm that finds the edges of the graph and their weights using $$ o(\frac{m\log \frac{n^d}{m}}{\log m}) $$ additive queries. using the information theoretic bound, the query complexities are tight.",4 "pyramidal gradient matching for optical flow estimation. initializing the optical flow field by either sparse descriptor matching or dense patch matches has been proved to be particularly useful for capturing large displacements. in this paper, we present a pyramidal gradient matching approach that can provide dense matches for highly accurate and efficient optical flow estimation. a novel contribution of our method is that the image gradient is used to describe image patches and is proved able to produce robust matching. therefore, our method is more efficient than methods that adopt special features (like sift) or patch distance metrics. moreover, we find that the image gradient is scalable for optical flow estimation, which means we can use different levels of gradient features (for example, full gradients or only the direction information of gradients) to obtain different complexity without dramatic changes in accuracy. another contribution is that we uncover the secrets of the limited patchmatch through a thorough analysis and design a pyramidal matching framework based on these secrets. our pyramidal matching framework is aimed at robust gradient matching and is effective at growing inliers and rejecting outliers. in this framework, we present some special enhancements for outlier filtering in gradient matching. by initializing epicflow with our matches, experimental results show that our method is efficient and robust (ranking 1st on both the clean pass and the final pass of the mpi sintel dataset among published methods).",4 "multi-label pixelwise classification for reconstruction of large-scale urban areas. object classification is one of the many holy grails in computer vision and as such has resulted in a very large number of algorithms being proposed already. specifically in recent years there has been considerable progress in this area, primarily due to the increased efficiency and accessibility of deep learning techniques. in fact, for single-label object classification [i.e.
where only one object is present in the image] the state-of-the-art techniques employ deep neural networks and report close to human-like performance. for specialized applications, however, single-label object-level classification does not suffice; for example, in cases where the image contains multiple intertwined objects of different labels. in this paper, we address the complex problem of multi-label pixelwise classification. we present a distinct solution based on a convolutional neural network (cnn) for performing multi-label pixelwise classification, and its application to large-scale urban reconstruction. a supervised learning approach is followed for training a 13-layer cnn using both lidar and satellite images. an empirical study has been conducted to determine the hyperparameters that result in the optimal performance of the cnn. scale invariance is introduced by training the network on five different scales of the input and labeled data. this results in six pixelwise classifications for each different scale. an svm is then trained to map the six pixelwise classifications into a single-label. lastly, we refine boundary pixel labels using graph-cuts for maximum a-posteriori (map) estimation with markov random field (mrf) priors. the resulting pixelwise classification is then used to accurately extract and reconstruct buildings in large-scale urban areas. the proposed approach has been extensively tested and the results are reported.",4 "probabilistic linear genetic programming with stochastic context-free grammar for solving symbolic regression problems. traditional linear genetic programming (lgp) algorithms are based only on a selection mechanism to guide the search. genetic operators combine or mutate random portions of the individuals, without knowing whether the result will lead to a fitter individual. probabilistic model building genetic programming (pmb-gp) methods were proposed to overcome this issue through a probability model that captures the structure of the fit individuals and uses it to sample new individuals. this work proposes the use of lgp with a stochastic context-free grammar (scfg) whose probability distribution is updated according to the selected individuals. the proposed method adapts the grammar to the linear representation of lgp.
tests performed with the proposed probabilistic method, and with two hybrid approaches, on several symbolic regression benchmark problems show that the results are statistically better than those obtained by traditional lgp.",4 "deep convolutional neural network using directional wavelets for low-dose x-ray ct reconstruction. due to the potential risk of inducing cancer, the radiation dose in x-ray ct should be reduced for routine patient scanning. however, in low-dose x-ray ct, severe artifacts usually occur due to photon starvation, beam hardening, etc., which decrease the reliability of diagnosis. thus, high quality reconstruction from low-dose x-ray ct data has become one of the most important research topics in the ct community. conventional model-based denoising approaches are, however, computationally very expensive, and image domain denoising approaches hardly deal with ct specific noise patterns. to address these issues, we propose an algorithm using a deep convolutional neural network (cnn), which is applied to the wavelet transform coefficients of low-dose ct images. specifically, by using a directional wavelet transform to extract the directional component of artifacts and exploit the intra- and inter-band correlations, our deep network can effectively suppress ct specific noise. moreover, our cnn is designed with various types of residual learning architecture for faster network training and better denoising. experimental results confirm that the proposed algorithm effectively removes complex noise patterns from ct images originating from the reduced x-ray dose. in addition, we show that the wavelet domain cnn is more efficient in removing the noise from low-dose ct than an image domain cnn. our results were rigorously evaluated by several radiologists and won the second place award in the 2016 aapm low-dose ct grand challenge. to the best of our knowledge, this work is the first deep learning architecture for low-dose ct reconstruction that has been rigorously evaluated and proven to be effective.",4 "probabilistic dimensionality reduction via structure learning. we propose a novel probabilistic dimensionality reduction framework that can naturally integrate the generative model and the locality information of data. based on this framework, we present a new model, which is able to learn a smooth skeleton of embedding points in a low-dimensional space from high-dimensional noisy data.
the formulation of the new model can be equivalently interpreted as two coupled learning problems, i.e., structure learning and the learning of the projection matrix. this interpretation motivates learning embedding points that can directly form an explicit graph structure. we develop a new method to learn embedding points that form a spanning tree, which is further extended to obtain a discriminative and compact feature representation for clustering problems. unlike traditional clustering methods, we assume that the centers of clusters should be close to each other if they are connected in the learned graph, and other cluster centers should be distant. this can greatly facilitate data visualization and scientific discovery in downstream analysis. extensive experiments are performed that demonstrate that the proposed framework is able to obtain discriminative feature representations and correctly recover the intrinsic structures of various real-world datasets.",19 "large-scale music annotation and retrieval: learning to rank in joint semantic spaces. music prediction tasks range from predicting tags given a song or clip of audio, to predicting the name of the artist, to predicting related songs given a song, clip, artist name or tag. that is, we are interested in every semantic relationship between the different musical concepts in our database. in realistically sized databases, the number of songs is measured in the hundreds of thousands or more, and the number of artists in the tens of thousands or more, providing a considerable challenge to standard machine learning techniques. in this work, we propose a method that scales to such datasets and attempts to capture the semantic similarities between the database items by modeling audio, artist names, and tags in a single low-dimensional semantic space. the choice of space is learnt by optimizing the set of prediction tasks of interest jointly using multi-task learning. our method outperforms baseline methods and, in comparison to them, is faster and consumes less memory. we also demonstrate how our method learns an interpretable model, where the semantic space captures well the similarities of interest.",4 "a note on the alternating minimization algorithm for the matrix completion problem. we consider the problem of reconstructing a low rank matrix from a subset of its entries and analyze two variants of the so-called alternating minimization algorithm, which has been proposed in the past.
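The alternating minimization idea just mentioned can be sketched in the simplest rank-1 case: model the matrix as an outer product u v^T and alternately solve the least-squares problem for u with v fixed, and for v with u fixed, over the revealed entries only. This is a generic illustration under stated assumptions (rank 1, a connected pattern of revealed entries), not the paper's exact variants.

```python
def altmin_rank1(observed, n, m, iters=500):
    """Rank-1 alternating minimization for matrix completion.

    observed: dict mapping (i, j) -> M[i][j] for the revealed entries.
    Alternately solves least squares for u with v fixed and vice versa."""
    u, v = [1.0] * n, [1.0] * m
    for _ in range(iters):
        for i in range(n):  # u_i minimizes sum_j (M_ij - u_i v_j)^2 over revealed j
            num = sum(val * v[j] for (a, j), val in observed.items() if a == i)
            den = sum(v[j] ** 2 for (a, j) in observed if a == i)
            if den:
                u[i] = num / den
        for j in range(m):  # symmetric update for v_j with u fixed
            num = sum(val * u[i] for (i, b), val in observed.items() if b == j)
            den = sum(u[i] ** 2 for (i, b) in observed if b == j)
            if den:
                v[j] = num / den
    return u, v

# ground truth M = u* v*^T with u* = [1, 2, 3], v* = [2, 1]; reveal 4 of 6 entries
truth = {(i, j): (i + 1) * [2, 1][j] for i in range(3) for j in range(2)}
omega = {k: truth[k] for k in [(0, 0), (0, 1), (1, 0), (2, 1)]}
u, v = altmin_rank1(omega, 3, 2)
assert abs(u[1] * v[0] - truth[(1, 0)]) < 1e-6   # revealed entry recovered
```

The revealed pattern here forms a connected bipartite graph over rows and columns, which is the kind of condition (bounded degree, small diameter) the note's analysis relies on.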
we establish that when the underlying matrix has rank $r=1$, has positive bounded entries, and the graph $\mathcal{g}$ underlying the revealed entries has bounded degree and diameter which is logarithmic in the size of the matrix, both algorithms succeed in reconstructing the matrix approximately in polynomial time starting from an arbitrary initialization. we provide simulation results which suggest that the second algorithm, which is based on message passing type updates, performs significantly better.",19 "deep learning for isotropic super-resolution from non-isotropic 3d electron microscopy. the most sophisticated existing methods to generate 3d isotropic super-resolution (sr) from non-isotropic electron microscopy (em) are based on learned dictionaries. unfortunately, none of the existing methods generate practically satisfying results. for 2d natural images, recently developed super-resolution methods that use deep learning have been shown to significantly outperform the previous state of the art. we have adapted one of the most successful architectures (fsrcnn) for 3d super-resolution, and compared its performance to a 3d u-net architecture that has not been used previously to generate super-resolution. we trained both architectures on artificially downscaled isotropic ground truth from focused ion beam milling scanning em (fib-sem) and tested the performance for various hyperparameter settings. our results indicate that both architectures can successfully generate 3d isotropic super-resolution from non-isotropic em, with the u-net performing consistently better. we propose several promising directions for practical application.",4 "intrusions as marked renewal processes. we present a probabilistic model of an intrusion as a marked renewal process. given a process, i.e., a sequence of events, an intrusion is a subsequence of events not produced by the process. applications of the model are, for example, online payment fraud with a fraudster taking over a user's account and performing payments on the user's behalf, or unexpected equipment failures due to unintended use. we adopt a bayesian approach to infer the probability of an intrusion in a sequence of events, the map subsequence of events constituting the intrusion, and the marginal probability of each event in the sequence belonging to the intrusion.
we evaluate the model for intrusion detection on synthetic data, as well as on anonymized data from an online payment system.",4 "evidence for the size principle in semantic and perceptual domains. shepard's universal law of generalization offered a compelling case for the first physics-like law in cognitive science that should hold for all intelligent agents in the universe. shepard's account is based on a rational bayesian model of generalization, providing an answer to the question of why such a law should emerge. extending this account to explain how humans use multiple examples to make better generalizations requires an additional assumption, called the size principle: hypotheses that pick out fewer objects should make a larger contribution to generalization. the degree to which this principle warrants similarly law-like status is far from conclusive. typically, evaluating this principle has not been straightforward, requiring additional assumptions. we present a new method for evaluating the size principle that is more direct, and apply this method to a diverse array of datasets. our results provide support for the broad applicability of the size principle.",4 "aerial spectral super-resolution using conditional adversarial networks. inferring spectral signatures from ground-based natural images has acquired a lot of interest in applied deep learning. in contrast to the spectra of ground-based images, aerial spectral images have low spatial resolution and suffer from higher noise interference. in this paper, we train a conditional adversarial network to learn an inverse mapping from the trichromatic space to 31 spectral bands within 400 to 700 nm. the network is trained on aerocampus, a first-of-its-kind aerial hyperspectral dataset. aerocampus consists of high spatial resolution color images and low spatial resolution hyperspectral images (hsi). color images synthesized from the 31 spectral bands are used to train our network. with a baseline root mean square error of 2.48 on the synthesized rgb test data, we show that it is possible to generate spectral signatures from aerial imagery.",4 "gpu-based image analysis on mobile devices. with the rapid advances in mobile technology, many mobile devices are capable of capturing high quality images and video with their embedded camera. this paper investigates techniques for real-time processing of the resulting images, particularly on-device, utilizing the graphical processing unit.
issues and limitations of image processing on mobile devices are discussed, and the performance of graphical processing units on a range of devices is measured through a programmable shader implementation of canny edge detection.",4 "binary excess risk for smooth convex surrogates. in statistical learning theory, convex surrogates of the 0-1 loss are highly preferred because of the computational and theoretical virtues that convexity brings in. it is of more importance to consider smooth surrogates, as witnessed by the fact that smoothness is beneficial both computationally - by attaining an {\it optimal} convergence rate for optimization - and in a statistical sense - by providing an improved {\it optimistic} rate for the generalization bound. in this paper we investigate the smoothness property from the viewpoint of statistical consistency and show how it affects the binary excess risk. we show that in contrast to the optimization and generalization errors, which favor the choice of a smooth surrogate loss, the smoothness of the loss function may degrade the binary excess risk. motivated by this negative result, we provide a unified analysis that integrates the optimization error, the generalization bound, and the error in translating convex excess risk into binary excess risk, when examining the impact of smoothness on the binary excess risk. we show that under favorable conditions, and with an appropriate choice of smooth convex loss, a binary excess risk better than $o(1/\sqrt{n})$ can be obtained.",4 "sparse image representation with epitomes. sparse coding, the decomposition of a vector using only a few basis elements, is widely used in machine learning and image processing. the basis set, also called a dictionary, is learned to adapt to specific data. this approach has proven to be very effective in many image processing tasks. traditionally, the dictionary is an unstructured ""flat"" set of atoms. in this paper, we study structured dictionaries which are obtained from an epitome, or a set of epitomes. an epitome is a small image, and the atoms are all the patches of a chosen size inside this image. this considerably reduces the number of parameters to learn and provides sparse image decompositions with shift-invariance properties. we propose a new formulation and an algorithm for learning the structured dictionaries associated with epitomes, and illustrate their use in image denoising tasks.",4 "autoperf: a generalized zero-positive learning system to detect software performance anomalies.
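The epitome construction described above — every fixed-size patch of one small image serves as a dictionary atom — can be sketched directly (the epitome values and patch size are illustrative; the learning of the epitome itself is not shown):

```python
def epitome_atoms(epitome, patch):
    """Enumerate all patch-sized atoms of an epitome (a small 2-D image).

    An H x W epitome with p x p patches yields (H-p+1)*(W-p+1) overlapping
    atoms, so far fewer parameters are stored than for an unstructured
    dictionary with the same number of atoms."""
    h, w = len(epitome), len(epitome[0])
    return [
        [row[j:j + patch] for row in epitome[i:i + patch]]
        for i in range(h - patch + 1)
        for j in range(w - patch + 1)
    ]

epitome = [[1, 2, 3],
           [4, 5, 6],
           [7, 8, 9]]
atoms = epitome_atoms(epitome, patch=2)
assert len(atoms) == 4                     # (3-2+1)**2 overlapping atoms
assert atoms[0] == [[1, 2], [4, 5]]        # top-left patch of the epitome
```

Because neighbouring atoms share pixels, the dictionary is shift-invariant by construction, which is the property the abstract highlights.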
in this paper, we present autoperf, a generalized software performance anomaly detection system. autoperf uses autoencoders, an unsupervised learning technique, and hardware performance counters to learn the performance signatures of parallel programs. it then uses this knowledge to identify when newer versions of the program suffer performance penalties, while simultaneously providing root cause analysis to help programmers debug the program's performance. autoperf is the first zero-positive learning performance anomaly detector, i.e., a system that trains entirely in the negative (non-anomalous) space to learn positive (anomalous) behaviors. we demonstrate autoperf's generality against three different types of performance anomalies: (i) true sharing cache contention, (ii) false sharing cache contention, and (iii) numa latencies, across 15 real world performance anomalies in 7 open source programs. autoperf has only 3.7% profiling overhead (on average) and detects more anomalies than the prior state-of-the-art approach.",4 "supervised ibp: neighbourhood preserving infinite latent feature models. we propose a probabilistic model to infer supervised latent variables in the hamming space from observed data. our model allows simultaneous inference of the number of binary latent variables and their values. the latent variables preserve the neighbourhood structure of the data in the sense that objects in the same semantic concept have similar latent values, and objects in different concepts have dissimilar latent values. we formulate the supervised infinite latent variable problem based on an intuitive principle of pulling objects together if they are of the same type, and pushing them apart if they are not. we then combine this principle with a flexible indian buffet process prior on the latent variables. we show that the inferred supervised latent variables can be directly used to perform a nearest neighbour search for the purpose of retrieval. we introduce a new application of dynamically extending hash codes, and show how to effectively couple the structure of the hash codes with the continuously growing structure of the neighbourhood preserving infinite latent feature space.",4 "non-distributional word vector representations. data-driven representation learning for words is a technique of central importance in nlp.
while indisputably useful as a source of features in downstream tasks, such vectors tend to consist of uninterpretable components whose relationship to the categories of traditional lexical semantic theories is tenuous at best. we present a method for constructing interpretable word vectors from hand-crafted linguistic resources like wordnet, framenet etc. these vectors are binary (i.e., they contain only 0 and 1) and are 99.9% sparse. we analyze their performance on state-of-the-art evaluation methods for distributional models of word vectors and find they are competitive with standard distributional approaches.",4 "syntactic structures and code parameters. we assign binary and ternary error-correcting codes to the data of syntactic structures of world languages and we study the distribution of code points in the space of code parameters. we show that, while most codes populate the lower region approximating a superposition of thomae functions, there is a substantial presence of codes above the gilbert-varshamov bound and even above the asymptotic bound and the plotkin bound. we investigate the dynamics induced on the space of code parameters by spin glass models of language change, and show that, in the presence of entailment relations between syntactic parameters, the dynamics can sometimes improve the code. for large sets of languages and syntactic data, one can gain information on the spin glass dynamics from the induced dynamics in the space of code parameters.",4 "comparison of three methods of clustering: k-means, spectral clustering and hierarchical clustering. for a comparison of these three kinds of clustering, we find a cost function and loss function and calculate them. since the error rate of clustering methods, calculated as an error percentage, is always one of the important factors in evaluating clustering methods, this paper introduces one way to calculate the error rate of clustering methods. clustering algorithms can be divided into several categories, including partitioning clustering algorithms, hierarchical algorithms and density based algorithms.
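Of the three methods compared in the clustering abstract above, k-means is the simplest to state. A minimal sketch of Lloyd's iterations on toy 1-D data (the data and initial centers are made up for illustration):

```python
def kmeans(points, centers, iters=20):
    """Lloyd's algorithm: assign each point to its nearest center, then
    move each center to the mean of its assigned points; repeat."""
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda c: abs(p - centers[c]))
            clusters[nearest].append(p)
        # empty clusters keep their previous center
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

points = [1.0, 1.2, 0.8, 8.0, 8.2, 7.8]
centers, clusters = kmeans(points, centers=[0.0, 10.0])
assert sorted(centers) == [1.0, 8.0]
assert sorted(clusters[0]) == [0.8, 1.0, 1.2]
```

An error rate of the kind the abstract discusses can then be computed by comparing each point's assigned cluster against a ground-truth labelling and counting mismatches under the best cluster-to-label matching.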
generally speaking, we compare clustering algorithms by their scalability, their ability to work with different attribute types, whether the clusters formed are of conventional shape, whether minimal domain knowledge is required to determine the input parameters, their ability to deal with noise and outliers, and their error rate when clustering new data, and thus the effect of input data with different dimensions and high levels. k-means is one of the simplest approaches to clustering, and clustering is an unsupervised problem.",4 "understanding deep neural networks with rectified linear units. in this paper we investigate the family of functions representable by deep neural networks (dnn) with rectified linear units (relu). we give an algorithm to train a relu dnn with one hidden layer to *global optimality* with runtime polynomial in the data size, albeit exponential in the input dimension. further, we improve on the known lower bounds on size (from exponential to super exponential) for approximating a relu deep net function by a shallower relu net. our gap theorems hold for smoothly parametrized families of ""hard"" functions, contrary to the countable, discrete families known in the literature. an example consequence of our gap theorems is the following: for every natural number $k$ there exists a function representable by a relu dnn with $k^2$ hidden layers and total size $k^3$, such that any relu dnn with at most $k$ hidden layers will require at least $\frac{1}{2}k^{k+1}-1$ total nodes. finally, for the family of $\mathbb{r}^n\to \mathbb{r}$ dnns with relu activations, we show a new lower bound on the number of affine pieces, which is larger than previous constructions in certain regimes of the network architecture, and distinctively our lower bound is demonstrated by an explicit construction of a *smoothly parameterized* family of functions attaining this scaling. our construction utilizes the theory of zonotopes from polyhedral theory.",4 "a batch, off-policy, actor-critic algorithm for optimizing the average reward. we develop an off-policy actor-critic algorithm for learning an optimal policy from a training set composed of data from multiple individuals. this algorithm is developed with a view towards its use in mobile health.",19 "multi-spectral image panchromatic sharpening, outcome and process quality assessment protocol. multispectral (ms) image panchromatic (pan) sharpening algorithms have been proposed by the remote sensing community in ever increasing number and variety.
their aim is to sharpen a coarse spatial resolution ms image with a fine spatial resolution pan image acquired simultaneously by a spaceborne or airborne earth observation (eo) optical imaging sensor pair. unfortunately, to date, no standard evaluation procedure for the ms image pan sharpening outcome and process has been community agreed upon, in contrast with the quality assurance framework for earth observation (qa4eo) guidelines proposed by the intergovernmental group on earth observations (geo). in general, a process is easier to measure, while an outcome is more important. the original contribution of the present study is fourfold. first, existing procedures for quantitative quality assessment (q2a) of the (sole) pan sharpened ms product are critically reviewed. their conceptual and implementation drawbacks are highlighted, to be overcome for quality improvement. second, a novel (to the best of the authors' knowledge, the first) protocol for q2a of the ms image pan sharpening product and process is designed, implemented and validated by independent means. third, within this protocol, an innovative categorization of spectral and spatial image quality indicators and metrics is presented. fourth, according to this new taxonomy, an original third-order isotropic multi-scale gray-level co-occurrence matrix (tims-glcm) calculator and tims-glcm texture feature extractor are proposed to replace the popular second-order glcms.",4 "a review of evaluation techniques for social dialogue systems. in contrast with goal-oriented dialogue, social dialogue has no clear measure of task success. consequently, evaluation of these systems is notoriously hard. in this paper, we review the current evaluation methods, focusing on automatic metrics. we conclude that turn-based metrics often ignore the context and do not account for the fact that several replies are valid, while end-of-dialogue rewards are mainly hand-crafted. both lack grounding in human perceptions.",4 "segan: speech enhancement generative adversarial network. current speech enhancement techniques operate on the spectral domain and/or exploit some higher-level feature. the majority of them tackle a limited number of noise conditions and rely on first-order statistics. to circumvent these issues, deep networks are being increasingly used, thanks to their ability to learn complex functions from large example sets.
in this work, we propose the use of generative adversarial networks for speech enhancement. in contrast to current techniques, we operate at the waveform level, training the model end-to-end, and incorporate 28 speakers and 40 different noise conditions into the same model, such that model parameters are shared across them. we evaluate the proposed model using an independent, unseen test set with two speakers and 20 alternative noise conditions. the enhanced samples confirm the viability of the proposed model, and both objective and subjective evaluations confirm the effectiveness of it. with that, we open the exploration of generative architectures for speech enhancement, which may progressively incorporate further speech-centric design choices to improve their performance.",4 "screen content image segmentation using sparse-smooth decomposition. sparse decomposition has been extensively used for different applications, including signal compression, denoising and document analysis. in this paper, sparse decomposition is used for image segmentation. the proposed algorithm separates the background and foreground using a sparse-smooth decomposition technique, such that the smooth and sparse components correspond to the background and foreground respectively. this algorithm is tested on several test images from hevc test sequences and is shown to have superior performance over other methods, such as hierarchical k-means clustering in djvu. this segmentation algorithm can also be used for text extraction, video compression and medical image segmentation.",4 "causal models have no complete axiomatic characterization. markov networks and bayesian networks are effective graphic representations of the dependencies embedded in probabilistic models. it is well known that independencies captured by markov networks (called graph-isomorphs) have a finite axiomatic characterization. this paper, however, shows that independencies captured by bayesian networks (called causal models) have no axiomatization by using even countably many horn or disjunctive clauses. this is because a sub-independency model of a causal model may be non-causal, while graph-isomorphs are closed under sub-models.",4 "meta-learning for phonemic annotation of corpora. we apply rule induction, classifier combination and meta-learning (stacked classifiers) to the problem of bootstrapping high accuracy automatic annotation of corpora with pronunciation information.
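Classifier combination of the kind the phonemic-annotation abstract describes can be sketched in its simplest form, a majority vote over base classifiers' per-example predictions (stacking proper would instead train a meta-learner on these outputs; the toy phoneme predictions below are made up, not the paper's memory-based or rule-induction learners):

```python
from collections import Counter

def combine(base_predictions):
    """Combine per-example predictions of several base classifiers by
    majority vote, the simplest form of classifier combination."""
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*base_predictions)]

# three base classifiers each predicting a phoneme for four words
clf_a = ["p", "b", "t", "d"]
clf_b = ["p", "p", "t", "t"]
clf_c = ["b", "b", "t", "d"]
assert combine([clf_a, clf_b, clf_c]) == ["p", "b", "t", "d"]
```

The combined output can correct individual classifiers' errors wherever the majority is right, which is the error-reduction effect the abstract reports from combining classifiers.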
The task we address in this paper consists of generating phonemic representations reflecting the Flemish and Dutch pronunciations of a word on the basis of its orthographic representation (which in turn is based on the actual speech recordings). We compare several possible approaches to achieve the text-to-pronunciation mapping task: memory-based learning, transformation-based learning, rule induction, maximum entropy modeling, combination of classifiers in stacked learning, and stacking of meta-learners. We are interested both in optimal accuracy and in obtaining insight into the linguistic regularities involved. As far as accuracy is concerned, an already high accuracy level (93% for Celex and 86% for Fonilex at word level) for single classifiers is boosted significantly, with additional error reductions of 31% and 38% respectively using a combination of classifiers, and a further 5% using a combination of meta-learners, bringing the overall word-level accuracy to 96% for the Dutch variant and 92% for the Flemish variant. We also show that the application of machine learning methods indeed leads to increased insight into the linguistic regularities determining the variation between the two pronunciation variants studied.",4 "Neuron pruning for compressing deep networks using maxout architectures. This paper presents an efficient and robust approach for reducing the size of deep neural networks by pruning entire neurons. It exploits maxout units for combining neurons into more complex convex functions and makes use of a local relevance measurement that ranks neurons according to their activation on the training set for pruning them. Additionally, a parameter reduction comparison between neuron and weight pruning is shown. It is empirically shown that the proposed neuron pruning reduces the number of parameters dramatically. The evaluation is performed on two tasks, MNIST handwritten digit recognition and LFW face verification, using a LeNet-5 and a VGG16 network architecture. The network size is reduced by $74\%$ and $61\%$, respectively, without affecting the network's performance. The main advantage of neuron pruning is its direct influence on the size of the network architecture. Furthermore, it is shown that neuron pruning can be combined with subsequent weight pruning, reducing the size of the LeNet-5 and VGG16 by $92\%$ and $80\%$ respectively.",4 "Do all fragments count?.
We aim at finding the minimal set of fragments which achieves maximal parse accuracy in data-oriented parsing. Experiments with the Penn Wall Street Journal treebank show that counts of almost arbitrary fragments within parse trees are important, leading to improved parse accuracy over previous models tested on this treebank. We isolate a number of dependency relations which previous models neglect but which contribute to higher parse accuracy.",4 "Vision recognition using discriminant sparse optimization learning. To better select the correct training sample and obtain a robust representation of the query sample, this paper proposes a discriminant-based sparse optimization learning model. This learning model integrates discriminant and sparsity together. Based on this model, we then propose a classifier called locality-based discriminant sparse representation (LDSR). Because discriminant can help to increase the difference of samples in different classes and to decrease the difference of samples within the same class, LDSR can obtain better sparse coefficients and constitute a better sparse representation for classification. In order to take advantage of kernel techniques, discriminant and sparsity, we further propose a nonlinear classifier called kernel locality-based discriminant sparse representation (KLDSR). Experiments on several well-known databases prove that the performance of LDSR and KLDSR is better than that of several state-of-the-art methods, including deep learning based methods.",4 "DeepHash: getting regularization, depth and fine-tuning right. This work focuses on representing very high-dimensional global image descriptors using very compact 64-1024 bit binary hashes for instance retrieval. We propose DeepHash: a hashing scheme based on deep networks. Key to making DeepHash work at extremely low bitrates are three important considerations -- regularization, depth and fine-tuning -- each requiring solutions specific to the hashing problem. In-depth evaluation shows that our scheme consistently outperforms state-of-the-art methods across all data sets for both Fisher vectors and deep convolutional neural network features, by up to 20 percent over other schemes.
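Binary-hash retrieval of the kind the DeepHash abstract above targets reduces, at query time, to ranking database codes by Hamming distance. A generic sketch (plain sign-thresholding, not DeepHash's learned hash function; all names are our own):

```python
import numpy as np

def binarize(features):
    # Sign-threshold real-valued descriptors into {0,1} bit vectors.
    return (np.asarray(features) > 0).astype(np.uint8)

def hamming(a, b):
    # Number of differing bits between two binary codes.
    return int(np.count_nonzero(a != b))

def rank_by_hamming(query, database):
    # Return database indices sorted by ascending Hamming distance.
    d = [hamming(query, code) for code in database]
    return sorted(range(len(d)), key=lambda i: d[i])
```

In practice the bits would be packed into machine words so Hamming distance becomes an XOR plus popcount.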
The retrieval performance with 256-bit hashes is close to that of the uncompressed floating point features -- a remarkable 512 times compression.",4 "A short review of ethical challenges in clinical natural language processing. Clinical NLP has an immense potential in contributing to how clinical practice will be revolutionized by the advent of large-scale processing of clinical records. However, this potential has remained largely untapped due to slow progress, primarily caused by strict data access policies for researchers. In this paper, we discuss the concern for privacy and the measures it entails. We also suggest sources of less sensitive data. Finally, we draw attention to biases that can compromise the validity of empirical research and lead to socially harmful applications.",4 "Encoder-decoder shift-reduce syntactic parsing. Starting from NMT, encoder-decoder neural networks have been used for many NLP problems. Graph-based models and transition-based models borrowing the encoder components achieve state-of-the-art performance on dependency parsing and constituent parsing, respectively. However, there has not been work empirically studying encoder-decoder neural networks in transition-based parsing. We apply a simple encoder-decoder to this end, achieving comparable results to the parser of Dyer et al. (2015) on standard dependency parsing, and outperforming the parser of Vinyals et al. (2015) on constituent parsing.",4 "Diffusion convolutional recurrent neural network: data-driven traffic forecasting. Spatiotemporal forecasting has various applications in neuroscience, climate and transportation domains. Traffic forecasting is one canonical example of such a learning task. The task is challenging due to (1) complex spatial dependency on road networks, (2) non-linear temporal dynamics with changing road conditions and (3) the inherent difficulty of long-term forecasting. To address these challenges, we propose to model the traffic flow as a diffusion process on a directed graph and introduce the diffusion convolutional recurrent neural network (DCRNN), a deep learning framework for traffic forecasting that incorporates both spatial and temporal dependency in the traffic flow.
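The bidirectional random walks that give DCRNN its spatial component amount to a weighted sum of transition-matrix powers applied to node features. A minimal numpy sketch under our own simplifications (scalar filter weight per diffusion step, no learned parameters, no recurrent cell):

```python
import numpy as np

def diffusion_conv(X, W, theta_fwd, theta_bwd):
    """K-step diffusion convolution on a directed graph.

    X: (N, F) node features; W: (N, N) weighted adjacency with
    positive row sums. theta_fwd/theta_bwd: length-K filter weights
    for the forward and backward random walks (a simplified sketch).
    """
    P_fwd = W / W.sum(axis=1, keepdims=True)      # forward transition matrix
    P_bwd = W.T / W.T.sum(axis=1, keepdims=True)  # backward transition matrix
    out = np.zeros_like(X, dtype=float)
    Hf, Hb = X.astype(float), X.astype(float)
    for tf, tb in zip(theta_fwd, theta_bwd):
        Hf = P_fwd @ Hf        # one more random-walk step per direction
        Hb = P_bwd @ Hb
        out += tf * Hf + tb * Hb
    return out
```

In the actual model this operation replaces the matrix multiplications inside a GRU cell, and the filter weights are learned.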
Specifically, DCRNN captures the spatial dependency using bidirectional random walks on the graph, and the temporal dependency using an encoder-decoder architecture with scheduled sampling. We evaluate the framework on two real-world large-scale road network traffic datasets and observe consistent improvement of 12% - 15% over state-of-the-art baselines.",4 "ChatPainter: improving text to image generation using dialogue. Synthesizing realistic images from text descriptions on a dataset like Microsoft Common Objects in Context (MS COCO), where each image can contain several objects, is a challenging task. Prior work has used text captions to generate images. However, captions might not be informative enough to capture the entire image, and are insufficient for the model to be able to understand which objects in the images correspond to which words in the captions. We show that adding a dialogue that further describes the scene leads to significant improvement in the inception score and in the quality of generated images on the MS COCO dataset.",4 "Retrieval and registration of long-range overlapping frames for scalable mosaicking of in vivo fetoscopy. Purpose: The standard clinical treatment of twin-to-twin transfusion syndrome consists in the photo-coagulation of undesired anastomoses located on the placenta which are responsible for a blood transfer between the two twins. While being the standard of care procedure, fetoscopy suffers from a limited field-of-view of the placenta, resulting in missed anastomoses. To facilitate the task of the clinician, building a global map of the placenta providing a larger overview of the vascular network is highly desired. Methods: To overcome the challenging visual conditions inherent to in vivo sequences (low contrast, obstructions or presence of artifacts, among others), we propose the following contributions: (i) robust pairwise registration is achieved by aligning the orientation of the image gradients, and (ii) difficulties regarding long-range consistency (e.g. due to the presence of outliers) are tackled via a bag-of-words strategy, which identifies overlapping frames of the sequence to be registered regardless of their respective location in time. Results: In addition to visual difficulties, in vivo sequences are characterised by the intrinsic absence of a gold standard. We present mosaics motivating qualitatively our methodological choices and demonstrating their promising aspect.
We also demonstrate semi-quantitatively, via visual inspection of registration results, the efficacy of our registration approach in comparison to two standard baselines. Conclusion: This paper proposes the first approach for the construction of mosaics of the placenta in in vivo fetoscopy sequences. Robustness to visual challenges during registration and long-range temporal consistency are proposed, offering first positive results on in vivo data for which standard mosaicking techniques are not applicable.",4 "Boosting trees for anti-spam email filtering. This paper describes a set of comparative experiments for the problem of automatically filtering unwanted electronic mail messages. Several variants of the AdaBoost algorithm with confidence-rated predictions [Schapire & Singer, 99] have been applied, which differ in the complexity of the base learners considered. Two main conclusions can be drawn from our experiments: a) the boosting-based methods clearly outperform the baseline learning algorithms (Naive Bayes and induction of decision trees) on the PU1 corpus, achieving very high levels of the F1 measure; b) increasing the complexity of the base learners allows one to obtain better ``high-precision'' classifiers, which is a very important issue when misclassification costs are considered.",4 "Relations on FP-soft sets applied to decision making problems. In this work, we first define relations on fuzzy parametrized soft sets and study their properties. We also give a decision making method based on these relations. In approximate reasoning, relations on fuzzy parametrized soft sets are shown to be of primordial importance. Finally, the method is successfully applied to problems that contain uncertainties.",12 "ARCO1: an application of belief networks to the oil market. Belief networks are a new, potentially important, class of knowledge-based models. ARCO1, currently under development at the Atlantic Richfield Company (ARCO) and the University of Southern California (USC), is the most advanced reported implementation of these models in a financial forecasting setting. ARCO1's underlying belief network models the variables believed to impact the crude oil market. A pictorial market model - developed on a Mac II - facilitates consensus among the members of the forecasting team. The system forecasts crude oil prices via Monte Carlo analyses of the network.
Several different models of the oil market have been developed; the system's ability to be updated quickly highlights its flexibility.",4 "Intraoperative margin assessment of human breast tissue in optical coherence tomography images using deep neural networks. Objective: In this work, we perform margin assessment of human breast tissue from optical coherence tomography (OCT) images using deep neural networks (DNNs). This work simulates the intraoperative setting of breast cancer lumpectomy. Methods: To train the DNNs, we use both state-of-the-art methods (weight decay and dropout) and a newly introduced regularization method based on function norms. Commonly used methods can fail when only a small database is available. The use of a function norm introduces a direct control over the complexity of the function, with the aim of diminishing the risk of overfitting. Results: As neither the code nor the data of previous results are publicly available, the obtained results are compared with reported results in the literature, in a conservative comparison. Moreover, our method is applied to locally collected data on several data configurations, and the reported results are the average over the different trials. Conclusion: The experimental results show that the use of DNNs yields significantly better results than the other techniques evaluated, in terms of sensitivity, specificity, F1 score, G-mean and Matthews correlation coefficient. Function norm regularization yielded higher and more robust results than the competing methods. Significance: We demonstrated a system that shows high promise for (partially) automated margin assessment of human breast tissue, with the equal error rate (EER) reduced from approximately 12\% (the lowest reported in the literature) to 5\%\,--\,a 58\% reduction. The method is computationally feasible for intraoperative application (less than 2 seconds per image).",19 "A worst-case upper bound for (1, 2)-QSAT. A rigorous theoretical analysis of an algorithm for a subclass of QSAT, i.e. (1, 2)-QSAT, which has been proposed in the literature. (1, 2)-QSAT, first introduced at SAT'08, can be seen as quantified extended 2-CNF formulas. Until now, to the best of our knowledge, there exists no algorithm presenting the worst-case upper bound for (1, 2)-QSAT. Therefore, in this paper, we present an exact algorithm to solve (1, 2)-QSAT.
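As a hedged aside on where bounds of the paper's form typically come from (our illustration of the style of analysis for branching algorithms over clauses, not necessarily this algorithm's actual recurrence): an algorithm whose binary branching removes at least two clauses on each branch satisfies

```latex
T(m) \le 2\,T(m-2) + \mathrm{poly}(m)
\;\Rightarrow\;
T(m) = O\!\left(2^{m/2}\right) = O\!\left((\sqrt{2})^{m}\right) \approx O(1.4142^{m}),
```

which is how constants such as $\sqrt{2} \approx 1.4142$ arise in worst-case bounds measured in the number of clauses $m$.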
By analyzing the algorithm, we obtain a worst-case upper bound of O(1.4142^m), where m is the number of clauses.",4 "Context driven label fusion for segmentation of subcutaneous and visceral fat in CT volumes. Quantification of adipose tissue (fat) from computed tomography (CT) scans is conducted mostly through manual or semi-automated image segmentation algorithms with limited efficacy. In this work, we propose a completely unsupervised and automatic method to identify adipose tissue, and then separate subcutaneous adipose tissue (SAT) from visceral adipose tissue (VAT) in the abdominal region. We offer a three-phase pipeline consisting of (1) initial boundary estimation using gradient points, (2) boundary refinement using geometric median absolute deviation and appearance based local outlier scores, and (3) context driven label fusion using conditional random fields (CRF) to obtain the final boundary between SAT and VAT. We evaluate the proposed method on 151 abdominal CT scans and obtain state-of-the-art 94% and 91% Dice similarity scores for SAT and VAT segmentation, as well as a significant reduction in the fat quantification error measure.",4 "An integral curvature representation and matching algorithms for identification of dolphins and whales. We address the problem of identifying individual cetaceans from images showing the trailing edge of their fins. Given a trailing edge from an unknown individual, we produce a ranking of known individuals from a database. The nicks and notches along the trailing edge define an individual's unique signature. We define a representation based on integral curvature that is robust to changes in viewpoint and pose, and captures the pattern of nicks and notches in a local neighborhood at multiple scales. We explore two ranking methods that use this representation. The first uses a dynamic programming time-warping algorithm to align two representations, and interprets the alignment cost as a measure of similarity. This algorithm also exploits learned spatial weights to downweight matches from regions of unstable curvature. The second interprets the representation as a feature descriptor. Feature keypoints are defined at the local extrema of the representation. Descriptors for the set of known individuals are stored in a tree structure, which allows us to perform queries given the descriptors from an unknown trailing edge.
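The dynamic-programming time-warping ranking method described in the cetacean-identification abstract above can be sketched with textbook DTW (absolute-difference local cost; the learned spatial weights mentioned in the abstract are omitted, and the interface is our own):

```python
import numpy as np

def dtw_cost(a, b):
    """Dynamic-programming time-warping cost between two 1-D sequences.

    A standard DTW sketch: lower cost means the two curvature
    sequences align better, hence are more similar.
    """
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(D[n, m])
```

Ranking the database then amounts to sorting known individuals by `dtw_cost` against the query's curvature sequence.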
We evaluate the top-k accuracy on two real-world datasets to demonstrate the effectiveness of the curvature representation, achieving top-1 accuracy scores of approximately 95% and 80% for bottlenose dolphins and humpback whales, respectively.",4 Historical dynamics of the lexical system as a random walk process. It is offered to consider word meaning changes in diachrony as a semicontinuous random walk with reflecting and swallowing screens. The basic characteristics of a word's life cycle are defined. Verification of the model is realized on data of the distribution of Russian words over various age periods.,4 "Learning using privileged information: SVM+ and weighted SVM. Prior knowledge can be used to improve the predictive performance of learning algorithms or reduce the amount of data required for training. The same goal is pursued within the learning using privileged information paradigm, which was recently introduced by Vapnik et al. and is aimed at utilizing additional information available only at training time -- a framework implemented by SVM+. We relate the privileged information to importance weighting and show that prior knowledge expressible with privileged features can also be encoded by weights associated with every training example. We show that a weighted SVM can always replicate an SVM+ solution, while the converse is not true, and we construct a counterexample highlighting the limitations of SVM+. Finally, we touch on the problem of choosing weights for weighted SVMs when privileged features are not available.",19 "Variable importance in binary regression trees and forests. We characterize and study variable importance (VIMP) and pairwise variable associations in binary regression trees. A key component involves the node mean squared error for a quantity we refer to as a maximal subtree. The theory naturally extends from single trees to ensembles of trees and applies to methods like random forests. This is useful because, while importance values from random forests are used to screen variables (for example, they are used to filter high throughput genomic data in bioinformatics), little theory exists about their properties.",19 "Contextual bandits with latent confounders: an NMF approach. Motivated by online recommendation and advertising systems, we consider a causal model for stochastic contextual bandits with a latent low-dimensional confounder. In our model, there are $L$ observed contexts and $K$ arms of the bandit.
The observed context influences the reward obtained through a latent confounder variable with cardinality $M$ ($M \ll L,K$). The arm choice and the latent confounder causally determine the reward, while the observed context is correlated with the confounder. Under this model, the $L \times K$ mean reward matrix $\mathbf{U}$ (for context in $[L]$ and arm in $[K]$) factorizes into non-negative factors $\mathbf{A}$ ($L \times M$) and $\mathbf{W}$ ($M \times K$). This insight enables us to propose an $\epsilon$-greedy NMF-Bandit algorithm that designs a sequence of interventions (selecting specific arms) that achieves a balance between learning this low-dimensional structure and selecting the best arm to minimize regret. Our algorithm achieves a regret of $\mathcal{O}\left(L\mathrm{poly}(M, \log K) \log T\right)$ at time $T$, as compared to $\mathcal{O}(LK\log T)$ for conventional contextual bandits, assuming a constant gap between the best arm and the rest for each context. These guarantees are obtained under mild sufficiency conditions on the factors that are weaker versions of the well-known statistical RIP condition. We propose a class of generative models that satisfy our sufficient conditions, and derive a lower bound of $\mathcal{O}\left(KM\log T\right)$. These are the first regret guarantees for online matrix completion with bandit feedback when the rank is greater than one. We further compare the performance of our algorithm with the state of the art on synthetic and real world data-sets.",4 "Improving image generative models with human interactions. GANs provide a framework for training generative models which mimic a data distribution. However, in many cases we wish to train these generative models to optimize some auxiliary objective function within the data it generates, such as making more aesthetically pleasing images. In some cases, these objective functions are difficult to evaluate, e.g. they may require human interaction. Here, we develop a system for efficiently improving a GAN to target an objective involving human interaction, specifically generating images that increase rates of positive user interactions. To improve the generative model, we build a model of human behavior in the targeted domain from a relatively small set of interactions, and then use this behavioral model as an auxiliary loss function to improve the generative model.
We show that this system is successful at improving positive interaction rates, at least on simulated data, and characterize some of the factors that affect its performance.",4 "Transition-based dependency parsing with pluggable classifiers. In principle, the design of transition-based dependency parsers makes it possible to experiment with any general-purpose classifier without other changes to the parsing algorithm. In practice, however, it often takes substantial software engineering to bridge between the different representations used by two software packages. We present extensions to MaltParser that allow the drop-in use of any classifier conforming to the interface of the Weka machine learning package, a wrapper for the TiMBL memory-based learner to this interface, and experiments on multilingual dependency parsing with a variety of classifiers. While earlier work had suggested that memory-based learners might be a good choice for low-resource parsing scenarios, we cannot support that hypothesis in this work. We observed that support-vector machines give better parsing performance than the memory-based learner, regardless of the size of the training set.",4 "Extracting urban impervious surface from GF-1 imagery using one-class classifiers. Impervious surface area is a direct consequence of urbanization, and it also plays an important role in urban planning and environmental management. With the rapid technical development of remote sensing, monitoring urban impervious surface via high spatial resolution (HSR) images has attracted unprecedented attention recently. Traditional multi-class models are inefficient for impervious surface extraction because they require labeling both needed and unneeded classes that occur in the image exhaustively. Therefore, we need to find a reliable one-class model to classify one specific land cover type without labeling other classes. In this study, we investigate several one-class classifiers, such as presence and background learning (PBL), positive unlabeled learning (PUL), OCSVM, BSVM and MAXENT, to extract urban impervious surface area using high spatial resolution imagery from GF-1, China's new generation high spatial resolution remote sensing satellite, and evaluate the classification accuracy based on artificial interpretation results.
Compared with traditional multi-class classifiers (ANN and SVM), the experimental results indicate that PBL and PUL provide higher classification accuracy, similar to the accuracy provided by the ANN model. Meanwhile, PBL and PUL outperform the OCSVM, BSVM, MAXENT and SVM models. Hence, one-class classifiers need only a small set of specific samples to train models without losing predictive accuracy, and are supposed to gain more attention for urban impervious surface extraction or other single land cover types.",4 "Optimal learning rates for localized SVMs. One of the limiting factors in using support vector machines (SVMs) in large scale applications is their super-linear computational requirements in terms of the number of training samples. To address this issue, several approaches that train SVMs on many small chunks of large data sets separately have been proposed in the literature. So far, however, almost all these approaches have only been empirically investigated. In addition, their motivation was always based on computational requirements. In this work, we consider a localized SVM approach based upon a partition of the input space. For this local SVM, we derive a general oracle inequality. Then we apply this oracle inequality to least squares regression using Gaussian kernels and deduce local learning rates that are essentially minimax optimal under some standard smoothness assumptions on the regression function. This gives the first motivation for using local SVMs that is not based on computational requirements but on theoretical predictions of the generalization performance. We further introduce a data-dependent parameter selection method for our local SVM approach and show that this method achieves the same learning rates as before. Finally, we present some larger scale experiments for our localized SVM showing that it achieves essentially the same test performance as a global SVM for a fraction of the computational requirements. In addition, it turns out that the computational requirements for the local SVMs are similar to those of a vanilla random chunk approach, while the achieved test errors are significantly better.",19 "A hierarchical approach for joint multi-view object pose estimation and categorization.
We propose a joint object pose estimation and categorization approach which extracts information about object poses and categories from the object parts and compositions constructed at different layers of a hierarchical object representation algorithm, namely Learned Hierarchy of Parts (LHOP). In the proposed approach, we first employ the LHOP to learn hierarchical part libraries which represent entity parts and compositions across different object categories and views. Then, we extract statistical and geometric features from the part realizations of the objects in the images in order to represent the information about object pose and category at each different layer of the hierarchy. Unlike traditional approaches, which consider specific layers of the hierarchies in order to extract information for performing specific tasks, we combine the information extracted at different layers to solve a joint object pose estimation and categorization problem using distributed optimization algorithms. We examine the proposed generative-discriminative learning approach and the algorithms on two benchmark 2-D multi-view image datasets. The proposed approach and the algorithms outperform state-of-the-art classification, regression and feature extraction algorithms. In addition, the experimental results shed light on the relationship between object categorization, pose estimation and the part realizations observed at different layers of the hierarchy.",4 "Assessing the threat of adversarial examples on deep neural networks. Deep neural networks are facing a potential security threat from adversarial examples: inputs that look normal but cause an incorrect classification by the deep neural network. For example, the proposed threat could result in hand-written digits on a scanned check being incorrectly classified but looking normal when humans see them. This research assesses the extent to which adversarial examples pose a security threat when one considers the normal image acquisition process. This process is mimicked by simulating the transformations that normally occur in acquiring an image in a real world application, such as using a scanner to acquire digits for a check amount or using a camera in an autonomous car. These small transformations negate the effect of the carefully crafted perturbations of adversarial examples, resulting in a correct classification by the deep neural network.
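The acquisition transformations discussed above can be mimicked generically by evaluating a classifier on several slightly shifted crops and averaging the scores; a minimal sketch (the helper name and interface are our own, and the paper's experiments of course use a trained DNN rather than an arbitrary `predict`):

```python
import numpy as np

def predict_with_crop_averaging(predict, img, crop, offsets):
    """Average a classifier's scores over several crops of an image.

    predict: function mapping a (crop, crop) array to a score vector.
    offsets: list of (dy, dx) top-left crop corners. A generic sketch
    of the crop-averaging defense described in the abstract.
    """
    scores = [predict(img[y:y + crop, x:x + crop]) for y, x in offsets]
    return np.mean(scores, axis=0)
```

Because adversarial perturbations are tuned to one exact pixel grid, averaging over misaligned crops tends to wash their effect out.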
Thus, acquiring the image decreases the potential impact of the proposed security threat. We also show that the already widely used process of averaging over multiple crops neutralizes most adversarial examples. Normal preprocessing, such as text binarization, almost completely neutralizes adversarial examples. This is the first paper to show that for text driven classification, adversarial examples are an academic curiosity and not a security threat.",4 "Sample complexity of end-to-end training vs. semantic abstraction training. We compare the end-to-end training approach to a modular approach in which the system is decomposed into semantically meaningful components. We focus on the sample complexity aspect, in the regime where an extremely high accuracy is necessary, as is the case in autonomous driving applications. We demonstrate cases in which the number of training examples required by the end-to-end approach is exponentially larger than the number of examples required by the semantic abstraction approach.",4 "An empirical evaluation of various deep learning architectures for bi-sequence classification tasks. Several tasks in argumentation mining and debating, question-answering, and natural language inference involve classifying a sequence in the context of another sequence (referred to as bi-sequence classification). For several single sequence classification tasks, the current state-of-the-art approaches are based on recurrent and convolutional neural networks. On the other hand, for bi-sequence classification problems, there is not much understanding of the best deep learning architecture. In this paper, we attempt to get an understanding of this category of problems by extensive empirical evaluation of 19 different deep learning architectures (specifically on different ways of handling context) for various problems originating in natural language processing like debating, textual entailment and question-answering. Following the empirical evaluation, we offer our insights and conclusions regarding the architectures considered. We also establish the first deep learning baselines for three argumentation mining tasks.",4 "Beyond the word-based language model in statistical machine translation. The language model is one of the most important modules in statistical machine translation, and currently the word-based language model dominates in this community.
However, many translation models (e.g. phrase-based models) generate the target language sentence by rendering and compositing phrases rather than words. Thus, it would be much more reasonable to model the dependency between phrases, but few research works have succeeded in solving this problem. In this paper, we tackle this problem by designing a novel phrase-based language model which attempts to solve three key sub-problems: 1, how to define a phrase in the language model; 2, how to determine phrase boundaries in large-scale monolingual data in order to enlarge the training set; 3, how to alleviate the data sparsity problem due to the huge vocabulary size of phrases. By carefully handling these issues, extensive experiments on Chinese-to-English translation show that our phrase-based language model can significantly improve the translation quality by up to +1.47 absolute BLEU score.",4 "A batchwise monotone algorithm for dictionary learning. We propose a batchwise monotone algorithm for dictionary learning. Unlike state-of-the-art dictionary learning algorithms that impose sparsity constraints on a sample-by-sample basis, we instead treat the samples as a batch and impose the sparsity constraint on the whole. The benefit of batchwise optimization is that the non-zeros can be better allocated across the samples, leading to a better approximation of the whole. To accomplish this, we propose procedures to switch non-zeros in both rows and columns in the support of the coefficient matrix to reduce the reconstruction error. We prove that with the proposed support switching procedure the objective of the algorithm, i.e., the reconstruction error, decreases monotonically and converges. Furthermore, we introduce a block orthogonal matching pursuit algorithm that also operates on sample batches to provide a warm start. Experiments on both natural image patches and UCI data sets show that the proposed algorithm produces a better approximation at the same sparsity levels compared to other state-of-the-art algorithms.",4 "A fuzzy soft rough k-means clustering approach for gene expression data. Clustering is one of the widely used data mining techniques for medical diagnosis, and it can be considered the most important unsupervised learning technique. Most of the clustering methods group data based on distance, and few methods cluster data based on similarity.
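The block orthogonal matching pursuit warm start mentioned in the dictionary-learning abstract above builds on standard OMP. A minimal single-sample OMP sketch (a textbook version, not the authors' batchwise support-switching algorithm; the interface is our own):

```python
import numpy as np

def omp(D, x, k):
    """Orthogonal matching pursuit: approximate x with k atoms of D.

    D: (d, n) dictionary with unit-norm columns; returns a sparse
    coefficient vector with at most k non-zeros.
    """
    residual = x.astype(float)
    support = []
    coef = np.zeros(D.shape[1])
    for _ in range(k):
        # Pick the atom most correlated with the current residual.
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j not in support:
            support.append(j)
        # Re-fit all selected atoms jointly by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
    coef[support] = sol
    return coef
```

The batchwise algorithm then improves on this greedy per-sample allocation by moving non-zeros between rows and columns of the whole coefficient matrix.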
Clustering algorithms classify gene expression data into clusters, and the functionally related genes are grouped together in an efficient manner. The groupings are constructed such that the degree of relationship is strong among members of the same cluster and weak among members of different clusters. In this work, we focus on a similarity relationship among genes with similar expression patterns, so that a consequential and simple analytical decision can be made with the proposed fuzzy soft rough k-means algorithm. The algorithm is developed based on fuzzy soft sets and rough sets. A comparative analysis of the proposed work is made with benchmark algorithms like k-means and rough k-means, and the efficiency of the proposed algorithm is illustrated in this work by using various cluster validity measures such as the DB index and the Xie-Beni index.",4 "Large scale distributed semi-supervised learning using streaming approximation. Traditional graph-based semi-supervised learning (SSL) approaches, even though widely applied, are not suited for massive data and large label scenarios, since they scale linearly with the number of edges $|E|$ and distinct labels $m$. To deal with the large label size problem, recent works propose sketch-based methods to approximate the distribution over labels per node, thereby achieving a space reduction from $O(m)$ to $O(\log m)$, under certain conditions. In this paper, we present a novel streaming graph-based SSL approximation that captures the sparsity of the label distribution and ensures that the algorithm propagates labels accurately, and further reduces the space complexity per node to $O(1)$. We also provide a distributed version of the algorithm that scales well to large data sizes. Experiments on real-world datasets demonstrate that the new method achieves better performance than existing state-of-the-art algorithms, with a significant reduction in memory footprint. We also study different graph construction mechanisms for natural language applications and propose a robust graph augmentation strategy trained using state-of-the-art unsupervised deep learning architectures that yields significant quality gains.",4 "Active neural localization. Localization is the problem of estimating the location of an autonomous agent from an observation and a map of the environment.
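The graph-based SSL abstract above exploits sparsity in the per-node label distribution. A loose, hypothetical sketch of plain label propagation with top-k truncation (this is our illustration of the sparsity idea, not the paper's streaming-sketch algorithm):

```python
import numpy as np

def propagate_labels(W, Y, iters=10, topk=1):
    """Graph label propagation keeping only the top-k labels per node.

    W: (n, n) symmetric edge weights; Y: (n, m) seed label
    distributions (zero rows for unlabeled nodes).
    """
    F = Y.astype(float).copy()
    seeds = Y.sum(axis=1) > 0
    for _ in range(iters):
        F = W @ F                      # mix neighbors' distributions
        F[seeds] = Y[seeds]            # clamp seed nodes
        # Truncate each row to its top-k entries, then renormalize.
        for i in range(F.shape[0]):
            row = F[i]
            if row.sum() == 0:
                continue
            keep = np.argsort(row)[-topk:]
            mask = np.zeros_like(row)
            mask[keep] = row[keep]
            F[i] = mask / mask.sum()
    return F
```

Storing only the truncated rows is what reduces per-node memory; the paper's streaming approximation achieves this with $O(1)$ space guarantees rather than an explicit top-k pass.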
Traditional methods of localization, which filter the belief based on the observations, are sub-optimal in the number of steps required, as they do not decide the actions taken by the agent. We propose ""Active Neural Localizer"", a fully differentiable neural network that learns to localize accurately and efficiently. The proposed model incorporates ideas from traditional filtering-based localization methods, by using a structured belief of the state with multiplicative interactions to propagate belief, and combines it with a policy model to localize accurately while minimizing the number of steps required for localization. Active Neural Localizer is trained end-to-end with reinforcement learning. We use a variety of simulation environments for our experiments, which include random 2D mazes, random mazes in the Doom game engine and a photo-realistic environment in the Unreal game engine. The results on the 2D environments show the effectiveness of the learned policy in an idealistic setting, while the results on the 3D environments demonstrate the model's capability of learning the policy and perceptual model jointly from raw-pixel based RGB observations. We also show that a model trained on random textures in the Doom environment generalizes well to a photo-realistic office space environment in the Unreal engine.",4 "Adversarial discriminative domain adaptation. Adversarial learning methods are a promising approach to training robust deep networks, and can generate complex samples across diverse domains. They can also improve recognition despite the presence of domain shift or dataset bias: several adversarial approaches to unsupervised domain adaptation have recently been introduced, which reduce the difference between the training and test domain distributions and thus improve generalization performance. Prior generative approaches show compelling visualizations, but are not optimal on discriminative tasks and can be limited to smaller shifts. Prior discriminative approaches could handle larger domain shifts, but imposed tied weights on the model and did not exploit a GAN-based loss. We first outline a novel generalized framework for adversarial adaptation, which subsumes recent state-of-the-art approaches as special cases, and we use this generalized view to better relate the prior approaches.
We then propose a previously unexplored instance of our general framework which combines discriminative modeling, untied weight sharing, and a GAN loss, which we call Adversarial Discriminative Domain Adaptation (ADDA). We show that ADDA is more effective yet considerably simpler than competing domain-adversarial methods, and demonstrate the promise of our approach by exceeding state-of-the-art unsupervised adaptation results on standard cross-domain digit classification tasks and a new, more difficult cross-modality object classification task.",4 "A tensor sparse and low-rank based submodule clustering method for multi-way data. A new submodule clustering method via sparse and low-rank representation for multi-way data is proposed in this paper. Instead of reshaping multi-way data into vectors, this method maintains their natural orders to preserve data intrinsic structures, e.g., image data are kept as matrices. To implement clustering, the multi-way data, viewed as tensors, are represented by the proposed tensor sparse and low-rank model to obtain a submodule representation, called a free module, which is finally used for spectral clustering. The proposed method extends the conventional subspace clustering method based on sparse and low-rank representation to multi-way data submodule clustering by combining the t-product operator. The new method is tested on several public datasets, including synthetical data, video sequences and toy images. The experiments show that the new method outperforms state-of-the-art methods, such as sparse subspace clustering (SSC), low-rank representation (LRR), ordered subspace clustering (OSC), robust latent low rank representation (RobustLatLRR) and the sparse submodule clustering method (SSmC).",4 "Lifelong learning CRF for supervised aspect extraction. This paper makes a focused contribution to supervised aspect extraction. It shows that if the system has performed aspect extraction in many past domains and retained their results as knowledge, conditional random fields (CRF) can leverage this knowledge in a lifelong learning manner to extract in a new domain markedly better than the traditional CRF without using this prior knowledge.
the key innovation is that even after crf training, the model can still improve its extraction with experiences in its applications.",4 "topic modeling of public repositories at scale using names in source code. programming languages have a limited number of reserved keywords and character based tokens that define the language specification. however, programmers make rich use of natural language within their code through comments, text literals and the naming of entities. the programmer defined names found in source code are a rich source of information for building a high level understanding of a project. the goal of this paper is to apply topic modeling to the names used in 13.6 million repositories and to perceive the inferred topics. one of the problems in such a study is the occurrence of duplicate repositories not officially marked as forks (obscure forks). we show how to address this using the same identifiers that are extracted for topic modeling. we open with a discussion on naming in source code, elaborate on our approach to remove exact duplicate and fuzzy duplicate repositories using locality sensitive hashing on the bag-of-words model, and discuss our work on topic modeling; finally we present the results from our data analysis together with open-access source code, tools and datasets.",4 "im2flow: motion hallucination from static images for action recognition. existing methods to recognize actions in static images take the images at face value, learning the appearances---objects, scenes, and body poses---that distinguish each action class. however, such models are deprived of the rich dynamic structure and motions that also define human activity. we propose an approach that hallucinates the unobserved future motion implied by a single snapshot to help static-image action recognition. the key idea is to learn a prior over short-term dynamics from thousands of unlabeled videos, infer the anticipated optical flow on novel static images, and then train discriminative models that exploit both streams of information. our main contributions are twofold. first, we devise an encoder-decoder convolutional neural network and a novel optical flow encoding that can translate a static image into an accurate flow map. second, we show the power of hallucinated flow for recognition, successfully transferring the learned motion into a standard two-stream network for activity recognition. on seven datasets, we demonstrate the power of the approach.
it not only achieves state-of-the-art accuracy for dense optical flow prediction, but also consistently enhances recognition of actions and dynamic scenes.",4 "automated playtesting with procedural personas through mcts with evolved heuristics. this paper describes a method for generative player modeling and its application to the automatic testing of game content using archetypal player models called procedural personas. theoretically grounded in psychological decision theory, procedural personas are implemented using a variation of monte carlo tree search (mcts) where the node selection criteria are developed using evolutionary computation, replacing the standard ucb1 criterion of mcts. using these personas we demonstrate how generative player models can be applied to a varied corpus of game levels and demonstrate how different play styles can be enacted in each level. in short, we use artificially intelligent personas to construct synthetic playtesters. the proposed approach could be used as a tool for automatic play testing when human feedback is not readily available or when a quick visualization of potential interactions is necessary. possible applications include interactive tools during game development or procedural content generation systems where many evaluations must be conducted within a short time span.",4 "plausibility and probability in deductive reasoning. we consider the problem of rational uncertainty about unproven mathematical statements, which g\""odel and others have remarked on. using bayesian-inspired arguments we build a normative model of fair bets under deductive uncertainty which draws from both probability theory and the theory of algorithms. we comment on connections to zeilberger's notion of ""semi-rigorous proofs"", particularly its inherent subjectivity as an obstacle.",4 "robust and fast decoding of high-capacity color qr codes for mobile applications. the use of color in qr codes brings extra data capacity, but also inflicts tremendous challenges on the decoding process due to chromatic distortion, cross-channel color interference and illumination variation. particularly, we discover a new type of chromatic distortion in high-density color qr codes, cross-module color interference, caused by the high density, which also makes geometric distortion correction more challenging.
to address these problems, we propose two approaches, namely, lsvm-cmi and qda-cmi, which jointly model the different types of chromatic distortion. extended from svm and qda, respectively, lsvm-cmi and qda-cmi optimize a particular objective function to learn a color classifier. furthermore, a robust geometric transformation method and several pipeline refinements are proposed to boost the decoding performance for mobile applications. we put forth and implement a framework for high-capacity color qr codes equipped with these methods, called hiq. to evaluate the performance of hiq, we collect a challenging large-scale color qr code dataset, cuhk-cqrc, which consists of 5390 high-density color qr code samples. the comparison with the baseline method [2] on cuhk-cqrc shows that hiq outperforms [2] by at least 188% in decoding success rate and 60% in bit error rate. our implementation of hiq on ios and android also demonstrates the effectiveness of the framework in real-world applications.",4 "a hebbian/anti-hebbian neural network for linear subspace learning: a derivation from multidimensional scaling of streaming data. neural network models of early sensory processing typically reduce the dimensionality of streaming input data. such networks learn the principal subspace, in the sense of principal component analysis (pca), by adjusting synaptic weights according to activity-dependent learning rules. when derived from a principled cost function, these rules are nonlocal and hence biologically implausible. at the same time, biologically plausible local rules have been postulated rather than derived from a principled cost function. here, to bridge this gap, we derive a biologically plausible network for subspace learning on streaming data by minimizing a principled cost function. in a departure from previous work, where the cost was quantified by the representation, or reconstruction, error, we adopt a multidimensional scaling (mds) cost function for streaming data. the resulting algorithm relies only on biologically plausible hebbian and anti-hebbian local learning rules. in a stochastic setting, the synaptic weights converge to a stationary state which projects the input data onto the principal subspace. if the data are generated by a nonstationary distribution, the network can track the principal subspace.
thus, our result makes a step towards an algorithmic theory of neural computation.",16 "cuckoo search: recent advances and applications. cuckoo search (cs) is a relatively new algorithm, developed by yang and deb in 2009, and cs is efficient in solving global optimization problems. in this paper, we review the fundamental ideas of cuckoo search and the latest developments as well as its applications. we analyze the algorithm to gain insight into its search mechanisms and to find out why it is efficient. we also discuss the essence of such algorithms and their links to self-organizing systems, and finally we propose important topics for further research.",12 "word sense disambiguation via high order of learning in complex networks. complex networks have been employed to model many real systems and serve as a modeling tool in a myriad of applications. in this paper, we apply the framework of complex networks to the problem of supervised classification in the word disambiguation task, which consists of deriving a function from the supervised (or labeled) training data of ambiguous words. traditional supervised data classification takes into account only topological or physical features of the input data. on the other hand, the human (animal) brain performs both low- and high-level orders of learning and has the facility to identify patterns according to the semantic meaning of the input data. in this paper, we apply a hybrid technique that encompasses both types of learning to the field of word sense disambiguation and show that the high-level order of learning can really improve the accuracy rate of the model. this evidence serves to demonstrate that the internal structures formed by the words do present patterns that, generally, cannot be correctly unveiled by traditional techniques. finally, we exhibit the behavior of the model for different weights of the low- and high-level classifiers by plotting decision boundaries. this study helps one to better understand the effectiveness of the model.",15 "training and evaluating multimodal word embeddings with large-scale web annotated images. in this paper, we focus on training and evaluating effective word embeddings with both text and visual information. more specifically, we introduce a large-scale dataset with 300 million sentences describing over 40 million images crawled and downloaded from publicly available pins (i.e. an image with sentence descriptions uploaded by users) on pinterest. this dataset is more than 200 times larger than ms coco, the standard large-scale image dataset with sentence descriptions.
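the streaming subspace learning described in the hebbian/anti-hebbian abstract above can be illustrated with a minimal sketch; as a stand-in for the paper's mds-derived similarity-matching network, this uses oja's subspace rule, whose hebbian (y x^t) and decorrelating (y y^t w) terms play analogous roles, and the data dimensions and learning rate are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# streaming data with a dominant 2-d principal subspace in R^5
basis = np.linalg.qr(rng.standard_normal((5, 2)))[0]
def sample():
    return basis @ (rng.standard_normal(2) * [3.0, 2.0]) + 0.05 * rng.standard_normal(5)

W = 0.1 * rng.standard_normal((2, 5))   # synaptic weights, 2 output neurons
eta = 0.01
for _ in range(5000):
    x = sample()
    y = W @ x                           # feedforward output activity
    # hebbian term (outer(y, x)) plus a decorrelating term (outer(y, y) @ W)
    W += eta * (np.outer(y, x) - np.outer(y, y) @ W)

# the rows of W should span the planted principal subspace:
# singular values of basis^T Q near 1 mean small principal angles
alignment = np.linalg.svd(basis.T @ np.linalg.qr(W.T)[0], compute_uv=False)
print(alignment)
```

with these settings the alignment singular values come out close to 1, i.e. the online updates converge to a projection onto the principal subspace, as the abstract's stationary-state claim describes.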
in addition, we construct an evaluation dataset to directly assess the effectiveness of word embeddings in terms of finding semantically similar or related words and phrases. the word/phrase pairs in this evaluation dataset are collected from the click data of millions of users in an image search system, and thus contain rich semantic relationships. based on these datasets, we propose and compare several recurrent neural network (rnn) based multimodal (text and image) models. experiments show that our model benefits from incorporating the visual information into the word embeddings, and that a weight sharing strategy is crucial for learning such multimodal embeddings. the project page is: http://www.stat.ucla.edu/~junhua.mao/multimodal_embedding.html",4 "learning rgb-d salient object detection using background enclosure, depth contrast, and top-down features. recently, deep convolutional neural networks (cnns) have demonstrated strong performance on rgb salient object detection. although depth information can help improve detection results, the exploration of cnns for rgb-d salient object detection remains limited. we propose a novel deep cnn architecture for rgb-d salient object detection that exploits high-level, mid-level, and low level features. further, we present novel depth features that capture the ideas of background enclosure and depth contrast and that are suitable for a learned approach. we show improved results compared to state-of-the-art rgb-d salient object detection methods. we also show that the low-level and mid-level depth features both contribute to improvements in the results. notably, the f-score of our method is 0.848 on the rgbd1000 dataset, which is 10.7% better than the second place.",4 "learning non-lambertian object intrinsics across shapenet categories. we consider the non-lambertian object intrinsic problem of recovering diffuse albedo, shading, and specular highlights from a single image of an object. we build a large-scale object intrinsics database based on existing 3d models in the shapenet database. rendered with realistic environment maps, millions of synthetic images of objects and their corresponding albedo, shading, and specular ground-truth images are used to train an encoder-decoder cnn. once trained, the network can decompose an image into the product of albedo and shading components, along with an additive specular component.
our cnn delivers accurate and sharp results in this classical inverse problem of computer vision, with sharp details attributed to the skip layer connections at corresponding resolutions from the encoder to the decoder. benchmarked on shapenet and mit intrinsics datasets, our model consistently outperforms the state-of-the-art by a large margin. we train and test our cnn on different object categories. perhaps surprisingly, especially from the cnn classification perspective, the intrinsics cnn generalizes well across categories. our analysis shows that feature learning at the encoder stage is crucial for developing a universal representation across categories. we apply our synthetic-data-trained model to images and videos downloaded from the internet, and observe robust and realistic intrinsics results. quality non-lambertian intrinsics could open up many interesting applications such as image-based albedo and specular editing.",4 "hitting times of local and global optima in genetic algorithms with high selection pressure. the paper is devoted to upper bounds on the expected first hitting times of the sets of local or global optima for non-elitist genetic algorithms with high selection pressure. the results of this paper extend the range of situations where upper bounds on the expected runtime are known for genetic algorithms and apply, in particular, to the canonical genetic algorithm. the obtained bounds do not require the probability of fitness-decreasing mutation to be bounded by a constant less than one.",4 "stable segmentation of a digital image. in this paper optimal image segmentation by means of piecewise constant approximations is considered. the optimality is defined by a minimum value of the total squared error or by the equivalent value of the standard deviation of the approximation from the image. the optimal approximations are defined independently of the method of obtaining them and might be generated by different algorithms. we investigate the computation of optimal approximations on the grounds of stability with respect to a given set of modifications. to obtain the optimal approximation, the mumford-shah model is generalized and developed, and its computational part is combined with the otsu method in its multi-thresholding version. the proposed solution is proved analytically and experimentally on the example of a standard image.",4 "visual-inertial-semantic scene representation for 3-d object detection.
we describe a system to detect objects in three-dimensional space using video and inertial sensors (accelerometer and gyrometer), which are ubiquitous in modern mobile platforms from phones to drones. inertials afford the ability to impose class-specific scale priors for objects, and provide a global orientation reference. a minimal sufficient representation, the posterior of semantic (identity) and syntactic (pose) attributes of objects in space, is decomposed into a geometric term, which can be maintained by a localization-and-mapping filter, and a likelihood function, which can be approximated by a discriminatively-trained convolutional neural network. the resulting system can process the video stream causally in real time, and provides a representation of objects in the scene that is persistent: confidence in the presence of objects grows with evidence, and objects previously seen are kept in memory even when temporarily occluded, with their return into view automatically predicted to prime re-detection.",4 "evolved policy gradients. we propose a meta-learning approach for learning gradient-based reinforcement learning (rl) algorithms. the idea is to evolve a differentiable loss function, such that an agent which optimizes its policy to minimize this loss will achieve high rewards. the loss is parametrized via temporal convolutions over the agent's experience. because this loss is highly flexible in its ability to take into account the agent's history, it enables fast task learning and eliminates the need for reward shaping at test time. empirical results show that our evolved policy gradient algorithm achieves faster learning on several randomized environments compared to an off-the-shelf policy gradient method. moreover, at test time, our learner optimizes only its learned loss function and requires no explicit reward signal. in effect, the agent internalizes the reward structure, suggesting a direction toward agents that learn to solve new tasks simply from intrinsic motivation.",4 "a novel variational model for image registration using gaussian curvature. image registration is one important task in many image processing applications. it aims to align two images so that useful information can be extracted through comparison, combination or superposition. this is achieved by constructing an optimal transformation which ensures that the template image becomes similar to a given reference image.
although many models exist, designing a model capable of modelling large and smooth deformation fields continues to pose a challenge. this paper proposes a novel variational model for image registration using the gaussian curvature as a regulariser. the model is motivated by the surface restoration work in geometric processing [elsey and esedoglu, multiscale model. simul., (2009), pp. 1549-1573]. an effective numerical solver is provided for the model using the augmented lagrangian method. numerical experiments show that the new model outperforms three competing models based on, respectively, linear curvature [fischer and modersitzki, j. math. imaging vis., (2003), pp. 81-85], mean curvature [chumchob, chen and brito, multiscale model. simul., (2011), pp. 89-128] and the diffeomorphic demon model [vercauteren et al., neuroimage, (2009), pp. 61-72] in terms of robustness and accuracy.",12 "regret analysis for continuous dueling bandit. the dueling bandit is a learning framework wherein the feedback information in the learning process is restricted to a noisy comparison between a pair of actions. in this research, we address a dueling bandit problem based on a cost function over a continuous space. we propose a stochastic mirror descent algorithm and show that the algorithm achieves an $o(\sqrt{t\log t})$-regret bound under strong convexity and smoothness assumptions on the cost function. subsequently, we clarify the equivalence between regret minimization in dueling bandit and convex optimization of the cost function. moreover, considering a lower bound in convex optimization, our algorithm is shown to achieve the optimal convergence rate in convex optimization and the optimal regret in dueling bandit except for a logarithmic factor.",19 "genegan: learning object transfiguration and attribute subspace from unpaired data. object transfiguration replaces an object in an image with another object from a second image. for example it can perform tasks like ""putting exactly those eyeglasses from image a on the nose of the person in image b"". usage of exemplar images allows more precise specification of desired modifications and improves the diversity of conditional image generation. however, previous methods rely on feature space operations and require paired data and/or appearance models for training or disentangling objects from background.
in this work, we propose a model that can learn object transfiguration from two unpaired sets of images: one set containing images that ""have"" that kind of object, and the other set being the opposite, with the mild constraint that the objects be located approximately at the same place. for example, the training data can be one set of reference face images that have eyeglasses, and another set of images that have not, both of them spatially aligned by face landmarks. despite the weak 0/1 labels, our model can learn an ""eyeglasses"" subspace that contains multiple representatives of different types of glasses. consequently, we can perform fine-grained control of generated images, like swapping the glasses in two images by swapping the projected components in the ""eyeglasses"" subspace, to create novel images of people wearing eyeglasses. overall, our deterministic generative model learns disentangled attribute subspaces from weakly labeled data by adversarial training. experiments on celeba and multi-pie datasets validate the effectiveness of the proposed model on real world data, in generating images with specified eyeglasses, smiling, hair styles, lighting conditions etc. the code is available online.",4 "interactive data integration through smart copy and paste. in many scenarios, such as emergency response or ad hoc collaboration, it is critical to reduce the overhead of integrating data. ideally, one could perform the entire process interactively under one unified interface: defining extractors and wrappers for sources, creating a mediated schema, and adding schema mappings, all while seeing the impact on the integrated view of the data, and refining the design accordingly. we propose a novel smart copy and paste (scp) model and architecture for seamlessly combining the design-time and run-time aspects of data integration, and we describe an initial prototype, the copycat system. in copycat, the user does not need special tools for the different stages of integration: instead, the system watches as the user copies data from applications (including the web browser) and pastes it into copycat's spreadsheet-like workspace. copycat generalizes these actions and presents proposed auto-completions, each with an explanation in the form of provenance. the user provides feedback on these suggestions, either through direct interactions or further copy-and-paste operations, and the system learns from this feedback.
this paper provides an overview of the prototype system and identifies key research challenges in achieving scp in its full generality.",4 "an entity-aware language model as an unsupervised reranker. in language modeling, it is difficult to incorporate entity relationships from a knowledge-base. one solution is to use a reranker trained with global features, where the global features are derived from n-best lists. however, training such a reranker requires manually annotated n-best lists, which are expensive to obtain. we propose a method based on the contrastive estimation method~\cite{smith2005contrastive} that alleviates the need for such data. experiments in the music domain demonstrate that global features, as well as features extracted from an external knowledge-base, can be incorporated into our reranker. the final model achieves a 0.44 absolute word error rate improvement on the blind test data.",4 "kernel alignment inspired linear discriminant analysis. kernel alignment measures the degree of similarity between two kernels. in this paper, inspired by kernel alignment, we propose a new linear discriminant analysis (lda) formulation, kernel alignment lda (kalda). we first define two kernels, the data kernel and the class indicator kernel. the problem is to find a subspace that maximizes the alignment between the subspace-transformed data kernel and the class indicator kernel. surprisingly, the kernel alignment induced kalda objective function is similar to that of classical lda and can be expressed using the between-class and total scatter matrices. it can be extended to multi-label data. we use a stiefel-manifold gradient descent algorithm to solve this problem. we perform experiments on 8 single-label and 6 multi-label data sets. the results show that kalda has good performance on many single-label and multi-label problems.",4 "new characterizations of minimum spanning trees and of saliency maps based on quasi-flat zones. we study three representations of hierarchies of partitions: dendrograms (direct representations), saliency maps, and minimum spanning trees. we provide a new bijection between saliency maps and hierarchies based on quasi-flat zones as used in image processing, and we characterize saliency maps and minimum spanning trees as solutions to constrained minimization problems where the constraint is quasi-flat zones preservation.
in practice, these results form a toolkit for new hierarchical methods where one can choose the most convenient representation. they also invite us to process non-image data with morphological hierarchies.",4 "mining heterogeneous multivariate time-series for learning meaningful patterns: application to home health telecare. over the last years, time-series mining has become a challenging issue for researchers. an important application lies in monitoring purposes, which require analyzing large sets of time-series for learning usual patterns. any deviation from this learned profile is then considered as an unexpected situation. moreover, complex applications may involve the temporal study of several heterogeneous parameters. in this paper, we propose a method for mining heterogeneous multivariate time-series for learning meaningful patterns. the proposed approach allows for mixed time-series -- containing both pattern and non-pattern data -- as well as for imprecise matches, outliers, and stretching and global translating of pattern instances in time. we present the early results of our approach in the context of monitoring the health status of a person at home. the purpose is to build a behavioral profile of a person by analyzing the time variations of several quantitative or qualitative parameters recorded through a provision of sensors installed in the home.",4 "hats: histograms of averaged time surfaces for robust event-based object classification. event-based cameras have recently drawn the attention of the computer vision community thanks to their advantages in terms of high temporal resolution, low power consumption and high dynamic range, compared to traditional frame-based cameras. these properties make event-based cameras an ideal choice for autonomous vehicles, robot navigation or uav vision, among others. however, the accuracy of event-based object classification algorithms, which is of crucial importance for any reliable system working in real-world conditions, is still far behind their frame-based counterparts. the two main reasons for this performance gap are: 1. the lack of effective low-level representations and architectures for event-based object classification and 2. the absence of large real-world event-based datasets. in this paper we address both problems. first, we introduce a novel event-based feature representation together with a new machine learning architecture.
compared to previous approaches, we use local memory units to efficiently leverage past temporal information and build a robust event-based representation. second, we release the first large real-world event-based dataset for object classification. we compare our method to the state-of-the-art with extensive experiments, showing better classification performance and real-time computation.",4 "the tractability of theory patching. in this paper we consider the problem of `theory patching', in which we are given a domain theory, some of whose components are indicated as possibly flawed, and a set of labeled training examples for the domain concept. the theory patching problem is to revise only the indicated components of the theory, such that the resulting theory correctly classifies all the training examples. theory patching is thus a type of theory revision in which revisions are made to individual components of the theory. our concern in this paper is to determine for which classes of logical domain theories the theory patching problem is tractable. we consider both propositional and first-order domain theories, and show that the theory patching problem is equivalent to that of determining what information contained in a theory is `stable' regardless of what revisions might be performed to the theory. we show that determining stability is tractable if the input theory satisfies two conditions: that revisions to each theory component have monotonic effects on the classification of examples, and that the theory components act independently in the classification of examples by the theory. we also show how the concepts introduced can be used to determine the soundness and completeness of particular theory patching algorithms.",4 "early fire detection using hep and space-time analysis. in this article, a video based early fire alarm system is developed by monitoring the smoke in the scene. there are two major contributions in this work. first, to find the best texture feature for smoke detection, a general framework, named histograms of equivalent patterns (hep), is adopted to achieve an extensive evaluation of various kinds of texture features. second, a \emph{block based inter-frame difference} (bifd) and an improved version of lbp-top are proposed and ensembled to describe the space-time characteristics of the smoke. in order to reduce false alarms, a smoke history image (shi) is utilized to register the recent classification results of the candidate smoke blocks.
experimental results using svm show that the proposed method can achieve better accuracy and fewer false alarms compared with state-of-the-art technologies.",4 "vain: attentional multi-agent predictive modeling. multi-agent predictive modeling is an essential step for understanding physical, social and team-play systems. recently, interaction networks (ins) were proposed for the task of modeling multi-agent physical systems, but ins scale with the number of interactions in the system (typically quadratic or higher order in the number of agents). in this paper we introduce vain, a novel attentional architecture for multi-agent predictive modeling that scales linearly with the number of agents. we show that vain is effective for multi-agent predictive modeling. our method is evaluated on tasks from challenging multi-agent prediction domains: chess and soccer, and it outperforms competing multi-agent approaches.",4 "pid parameters optimization using genetic algorithm. time delays are components that introduce a time-lag into a system's response. they arise in physical, chemical, biological and economic systems, as well as in the process of measurement and computation. in this work, we implement a genetic algorithm (ga) for determining pid controller parameters to compensate the delay in first order lag plus time delay (folpd) systems and compare the results with the iterative method and ziegler-nichols rule results.",4 "attend, infer, repeat: fast scene understanding with generative models. we present a framework for efficient inference in structured image models that explicitly reason about objects. we achieve this by performing probabilistic inference using a recurrent neural network that attends to scene elements and processes them one at a time. crucially, the model itself learns to choose the appropriate number of inference steps. we use this scheme to learn to perform inference in partially specified 2d models (variable-sized variational auto-encoders) and fully specified 3d models (probabilistic renderers). we show that such models learn to identify multiple objects - counting, locating and classifying the elements of a scene - without any supervision, e.g., decomposing 3d images with various numbers of objects in a single forward pass of a neural network.
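the ga-tuned pid setup described in the pid abstract above can be sketched as follows; the plant parameters, ga settings, and the ise cost are illustrative assumptions rather than the paper's actual configuration:

```python
import random

random.seed(1)

def simulate(kp, ki, kd, dt=0.05, steps=400, tau=1.0, gain=1.0, delay=10):
    """integral-of-squared-error of a PID loop around a discretized FOLPD plant."""
    y, integ, prev_err, cost = 0.0, 0.0, 0.0, 0.0
    buf = [0.0] * delay                    # transport-delay buffer
    for _ in range(steps):
        err = 1.0 - y                      # unit step setpoint
        integ += err * dt
        deriv = (err - prev_err) / dt
        prev_err = err
        buf.append(kp * err + ki * integ + kd * deriv)
        u_delayed = buf.pop(0)
        y += dt / tau * (gain * u_delayed - y)   # first-order lag
        if not (-1e6 < y < 1e6):
            return 1e12                    # unstable gains: large penalty
        cost += err * err * dt
    return cost

def evolve(pop_size=30, gens=40):
    pop = [[random.uniform(0, 5) for _ in range(3)] for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=lambda g: simulate(*g))
        elite = pop[: pop_size // 3]       # truncation selection
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = random.sample(elite, 2)
            child = [random.choice(p) for p in zip(a, b)]                 # crossover
            children.append([max(0.0, g + random.gauss(0, 0.2)) for g in child])  # mutation
        pop = elite + children
    return min(pop, key=lambda g: simulate(*g))

best = evolve()
print(best, simulate(*best))
```

the fitness is the step-response ise; truncation selection, uniform crossover, and gaussian mutation are one standard ga recipe among many that could be used here.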
we further show that the networks produce accurate inferences when compared to supervised counterparts, and that their structure leads to improved generalization.",4 "towards improving validation, verification, crash investigations, and event reconstruction of flight-critical systems with self-forensics. this paper introduces a novel concept of self-forensics to complement the standard autonomic self-chop properties of self-managed systems, specified in the forensic lucid language. we argue that self-forensics, with the forensics taken out of the cybercrime domain, is applicable to the ""self-dissection"" of autonomous software and hardware systems, for the purpose of verification of flight-critical systems and for automated incident and anomaly analysis and event reconstruction by engineering teams in a variety of incident scenarios during design and testing as well as on actual flight data.",4 "thinking, learning, and autonomous problem solving. ever increasing computational power will require methods for automatic programming. we present an alternative to genetic programming, based on a general model of thinking and learning. the advantage is that evolution takes place in the space of constructs and can thus exploit the mathematical structures of this space. the model is formalized, and a macro language is presented that allows a formal yet intuitive description of the problem under consideration. a prototype has been developed to implement the scheme in perl. the method can lead to a concentration on the analysis of problems, rapid prototyping, the treatment of new problem classes, and the investigation of philosophical problems. we see fields of application in nonlinear differential equations, pattern recognition, robotics, model building, and animated pictures.",4 "joint inference of multiple label types in large networks. we tackle the problem of inferring node labels in a partially labeled graph where each node in the graph has multiple label types and each label type has a large number of possible labels. our primary example, and the focus of this paper, is the joint inference of label types such as hometown, current city, and employers for users connected by a social network. standard label propagation fails to consider the properties of the label types and the interactions between them. our proposed method, called edgeexplain, explicitly models these, while still enabling scalable inference under a distributed message-passing architecture.
on a billion-node subset of the facebook social network, edgeexplain significantly outperforms label propagation for several label types, with lifts of up to 120% for recall@1 and 60% for recall@3.",4 "on consistent vertex nomination schemes. given a vertex of interest in a network $g_1$, the vertex nomination problem seeks to find the corresponding vertex of interest (if it exists) in a second network $g_2$. although the vertex nomination problem and related tasks have attracted much attention in the machine learning literature, with applications to social and biological networks, the framework has so far been confined to a comparatively small class of network models, and the concept of statistically consistent vertex nomination schemes has been only shallowly explored. in this paper, we extend the vertex nomination problem to a very general statistical model of graphs. further, drawing inspiration from the long-established classification framework of the pattern recognition literature, we provide definitions for the key notions of bayes optimality and consistency in our extended vertex nomination framework, including a derivation of the bayes optimal vertex nomination scheme. in addition, we prove that no universally consistent vertex nomination schemes exist. illustrative examples are provided throughout.",19 "text2action: generative adversarial synthesis from language to action. in this paper, we propose a generative model which learns the relationship between language and human action in order to generate a human action sequence given a sentence describing human behavior. the proposed generative model is a generative adversarial network (gan) based on the sequence to sequence (seq2seq) model. using the proposed generative network, we can synthesize various actions for a robot or a virtual agent using a text encoder recurrent neural network (rnn) and an action decoder rnn. the proposed generative network is trained from 29,770 pairs of actions and sentence annotations extracted from msr-video-to-text (msr-vtt), a large-scale video dataset. we demonstrate that the network can generate human-like actions which can be transferred to a baxter robot, such that the robot performs an action based on a provided sentence.
the results show that the proposed generative network correctly models the relationship between language and action, and can generate a diverse set of actions from a sentence.",4 "leveraging the path signature for skeleton-based human action recognition. human action recognition in videos is one of the most challenging tasks in computer vision. one important issue is how to design discriminative features for representing spatial context and temporal dynamics. here, we introduce a path signature feature to encode information from intra-frame and inter-frame contexts. a key step towards leveraging this feature is to construct proper trajectories (paths) from the data stream. in each frame, the correlated constraints of human joints are treated as small paths, and spatial path signature features are extracted from them. for video data, the evolution of these spatial features over time is also regarded as paths, from which temporal path signature features are extracted. eventually, all these features are concatenated to constitute the input vector of a fully connected neural network for action classification. experimental results on four standard benchmark action datasets, j-hmdb, the sbu dataset, berkeley mhad, and nturgb+d, demonstrate that the proposed approach achieves state-of-the-art accuracy even in comparison with recent deep learning based models.",4 "time series analysis via matrix estimation. we consider the task of interpolating and forecasting a time series in the presence of noise and missing data. as the main contribution of this work, we introduce an algorithm that transforms the observed time series into a matrix, utilizes singular value thresholding to simultaneously recover missing values and de-noise observed entries, and performs linear regression to make predictions. we argue that this method provides meaningful imputation and forecasting for a large class of models: finite sums of harmonics (which approximate stationary processes), non-stationary sublinear trends, linear time-invariant (lti) systems, and their additive mixtures. in general, our algorithm recovers the hidden state of dynamics based on noisy observations, as in a hidden markov model (hmm), provided the dynamics obey the above stated models.
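the three-step matrix-estimation pipeline in the time-series abstract above (fold into a matrix, threshold singular values, regress) can be sketched as follows; the page-matrix shape, the retained rank, and the rescaling for missing entries are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# noisy sum of harmonics, with some entries missing
t = np.arange(1000)
signal = np.sin(2 * np.pi * t / 50) + 0.5 * np.sin(2 * np.pi * t / 120)
noisy = signal + 0.5 * rng.standard_normal(t.size)
mask = rng.random(t.size) > 0.1            # ~10% missing
observed = np.where(mask, noisy, 0.0)      # missing entries imputed as zero

# step 1: fold the series into non-overlapping pages of a matrix
L = 40
X = observed.reshape(L, -1, order='F')     # L x (n/L) page matrix

# step 2: singular value thresholding to impute and de-noise
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 4                                      # assumed rank: two harmonics fold into rank-4 pages
X_hat = (U[:, :k] * s[:k]) @ Vt[:k]
denoised = X_hat.reshape(-1, order='F') / mask.mean()  # rescale for the missing mass

# step 3: linear (auto)regression on the de-noised series for forecasting
p = 10
A = np.column_stack([denoised[i:-(p - i)] for i in range(p)])
coef, *_ = np.linalg.lstsq(A, denoised[p:], rcond=None)
next_val = denoised[-p:] @ coef            # one-step-ahead forecast
```

keeping only the top singular values suppresses most of the observation noise while preserving the low-rank structure that the harmonics induce, which is the intuition behind the imputation-plus-denoising claim in the abstract.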
we demonstrate on synthetic and real-world datasets that our algorithm outperforms standard software packages in the presence of significant amounts of missing data and high levels of noise, even when those packages are given the underlying model while our algorithm remains oblivious to it. this is in line with our finite sample analysis for these model classes.",4 "manifold matching using shortest-path distance and joint neighborhood selection. matching datasets of multiple modalities has become an important task in data analysis. existing methods often rely on the embedding and transformation of a single modality without utilizing correspondence information, which often results in sub-optimal matching performance. in this paper, we propose a nonlinear manifold matching algorithm using shortest-path distance and joint neighborhood selection. specifically, a joint nearest-neighbor graph is built for all modalities. then the shortest-path distance within each modality is calculated from the joint neighborhood graph, followed by embedding into and matching in a common low-dimensional euclidean space. compared to existing algorithms, our approach exhibits superior performance for matching disparate datasets of multiple modalities.",19 "representation learning for visual-relational knowledge graphs. a visual-relational knowledge graph (kg) is a multi-relational graph whose entities are associated with images. we introduce imagegraph, a kg with 1,330 relation types, 14,870 entities, and 829,931 images. visual-relational kgs lead to novel probabilistic query types in which images are treated as first-class citizens. both the prediction of relations between unseen images and multi-relational image retrieval can be formulated as query types in a visual-relational kg. we approach the problem of answering such queries with a novel combination of deep convolutional networks and models for learning knowledge graph embeddings. the resulting models can answer queries such as ""how are these two unseen images related to each other?"" we also explore a zero-shot learning scenario in which an image of an entirely new entity is linked with multiple relations to entities of an existing kg. the multi-relational grounding of unseen entity images into a knowledge graph serves as a description of such an entity.
We conduct experiments to demonstrate that the proposed deep architectures, in combination with KG embedding objectives, can answer such visual-relational queries efficiently and accurately.",4 "SSH: single stage headless face detector. We introduce the Single Stage Headless (SSH) face detector. Unlike two stage proposal-classification detectors, SSH detects faces in a single stage directly from the early convolutional layers in a classification network. SSH is headless. That is, it is able to achieve state-of-the-art results while removing the ""head"" of its underlying classification network -- i.e. all fully connected layers in VGG-16, which contain a large number of parameters. Additionally, instead of relying on an image pyramid to detect faces with various scales, SSH is scale-invariant by design. We simultaneously detect faces with different scales in a single forward pass of the network, but from different layers. These properties make SSH fast and light-weight. Surprisingly, with a headless VGG-16, SSH beats the ResNet-101-based state-of-the-art on the WIDER dataset. Even though, unlike the current state-of-the-art, SSH does not use an image pyramid, it is 5x faster. Moreover, if an image pyramid is deployed, our light-weight network achieves state-of-the-art on all subsets of the WIDER dataset, improving the AP by 2.5%. SSH also reaches state-of-the-art results on the FDDB and Pascal-Faces datasets while using a small input size, leading to a runtime of 50 ms/image on a GPU. The code is available at https://github.com/mahyarnajibi/SSH.",4 "decentralized supply chain formation: a market protocol and competitive equilibrium analysis. Supply chain formation is the process of determining the structure and terms of exchange relationships to enable a multilevel, multiagent production activity. We present a simple model of supply chains, highlighting two characteristic features: hierarchical subtask decomposition, and resource contention. To decentralize the formation process, we introduce a market price system over the resources produced along the chain. In a competitive equilibrium for this system, agents choose locally optimal allocations with respect to prices, and outcomes are optimal overall. To determine prices, we define a market protocol based on distributed, progressive auctions, and myopic, non-strategic agent bidding policies. 
In the presence of resource contention, this protocol produces better solutions than the greedy protocols common in the artificial intelligence and multiagent systems literature. The protocol often converges to high-value supply chains, and when competitive equilibria exist, typically to approximate competitive equilibria. However, complementarities in agent production technologies can cause the protocol to wastefully allocate inputs to agents that do not produce their outputs. A subsequent decommitment phase recovers a significant fraction of the lost surplus.",4 "multi-objective contextual bandit problem with similarity information. In this paper we propose the multi-objective contextual bandit problem with similarity information. This problem extends the classical contextual bandit problem with similarity information by introducing multiple and possibly conflicting objectives. Since the best arm in each objective can be different for a given context, learning the best arm based on a single objective can jeopardize the rewards obtained from the other objectives. In order to evaluate the performance of the learner in this setup, we use a performance metric called the contextual Pareto regret. Essentially, the contextual Pareto regret is the sum of the distances of the arms chosen by the learner to the context dependent Pareto front. For this problem, we develop a new online learning algorithm called Pareto Contextual Zooming (PCZ), which exploits the idea of contextual zooming to learn the arms that are close to the Pareto front for each observed context, by adaptively partitioning the joint context-arm set according to the observed rewards and locations of the context-arm pairs selected in the past. We then prove that PCZ achieves $\tilde{O}(T^{(1+d_p)/(2+d_p)})$ Pareto regret, where $d_p$ is the Pareto zooming dimension that depends on the size of the set of near-optimal context-arm pairs. Moreover, we show that this regret bound is nearly optimal by providing an almost matching $\Omega(T^{(1+d_p)/(2+d_p)})$ lower bound.",19 "algorithms for closed under rational behavior (CURB) sets. We provide a series of algorithms demonstrating that solutions according to the fundamental game-theoretic solution concept of closed under rational behavior (CURB) sets in two-player, normal-form games can be computed in polynomial time (we also discuss extensions to n-player games). 
First, we describe an algorithm that identifies a player's best responses conditioned on the belief that the other player will play from within a given subset of its strategy space. This algorithm serves as a subroutine in a series of polynomial-time algorithms for finding all minimal CURB sets, one minimal CURB set, and the smallest minimal CURB set in a game. We then show that the complexity of finding a Nash equilibrium can be exponential only in the size of a game's smallest CURB set. Related to this, we show that the smallest CURB set can be an arbitrarily small portion of the game, but can also be arbitrarily larger than the supports of its only enclosed Nash equilibrium. We test our algorithms empirically and find that commonly studied academic games tend to have either very large or very small minimal CURB sets.",4 "convolutional kernel networks. An important goal in visual recognition is to devise image representations that are invariant to particular transformations. In this paper, we address this goal with a new type of convolutional neural network (CNN) whose invariance is encoded by a reproducing kernel. Unlike traditional approaches where neural networks are learned either to represent data or to solve a classification task, our network learns to approximate the kernel feature map on training data. Such an approach enjoys several benefits over classical ones. First, by teaching CNNs to be invariant, we obtain simple network architectures that achieve a similar accuracy to more complex ones, while being easy to train and robust to overfitting. Second, we bridge a gap between the neural network literature and kernels, which are natural tools to model invariance. We evaluate our methodology on visual recognition tasks where CNNs have proven to perform well, e.g., digit recognition with the MNIST dataset, and the more challenging CIFAR-10 and STL-10 datasets, where our accuracy is competitive with the state of the art.",4 text analysis tools for spoken language processing. This submission contains a postscript final version of the slides used for an ACL-94 tutorial.,2 "deep reinforcement learning from raw pixels in Doom. Using current reinforcement learning methods, it has recently become possible to learn to play unknown 3D games from raw pixels. In this work, we study the challenges that arise in such complex environments, and summarize current methods to approach these. We choose a task within the Doom game that has not been approached yet. The goal for the agent is to fight enemies in a 3D world consisting of five rooms. We train the DQN and LSTM-A3C algorithms on this task. 
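Returning to the CURB-sets abstract above: its best-response subroutine (best responses of one player given that the opponent plays within a restricted strategy subset) can be sketched roughly as below. This is a simplified illustration, not the authors' algorithm: the real subroutine searches over all beliefs with the given support (an LP-style test), while this sketch only checks the pure beliefs and the uniform belief, so it under-approximates the full best-response set.

```python
import numpy as np

def best_responses(payoff, opponent_support):
    """Row player's best responses when the column player is believed to
    mix only over the strategy indices in `opponent_support`.
    Checks only pure beliefs and the uniform belief over the support."""
    n_cols = payoff.shape[1]
    beliefs = [np.eye(n_cols)[j] for j in opponent_support]
    uniform = np.zeros(n_cols)
    uniform[list(opponent_support)] = 1.0 / len(opponent_support)
    beliefs.append(uniform)
    responses = set()
    for b in beliefs:
        expected = payoff @ b                    # expected payoff of each row
        responses.update(np.flatnonzero(np.isclose(expected, expected.max())))
    return sorted(responses)

# matching pennies: against the full support, both rows are best responses
payoff = np.array([[1, -1], [-1, 1]])
brs = best_responses(payoff, [0, 1])
```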
Our results show that both algorithms learn sensible policies, but fail to achieve high scores given the amount of training. We provide insights into the learned behavior, which can serve as a valuable starting point for further research in the Doom domain.",4 "spatial random sampling: a structure-preserving data sketching tool. Random column sampling is not guaranteed to yield data sketches that preserve the underlying structures of the data, and may not sample sufficiently from less-populated data clusters. Also, adaptive sampling can often provide accurate low rank approximations, yet may fall short of producing descriptive data sketches, especially when the cluster centers are linearly dependent. Motivated by that, this paper introduces a novel randomized column sampling tool dubbed Spatial Random Sampling (SRS), in which data points are sampled based on their proximity to randomly sampled points on the unit sphere. The most compelling feature of SRS is that the corresponding probability of sampling from a given data cluster is proportional to the surface area the cluster occupies on the unit sphere, independently of the size of the cluster population. Although it is fully randomized, SRS is shown to provide descriptive and balanced data representations. The proposed idea addresses a pressing need in data science and holds the potential to inspire many novel approaches for the analysis of big data.",4 "extracting temporal and causal relations between events. Structured information resulting from temporal information processing is crucial for a variety of natural language processing tasks, for instance to generate timeline summarizations of events from news documents, or to answer temporal/causal-related questions about some events. In this thesis we present a framework for an integrated temporal and causal relation extraction system. We first develop a robust extraction component for each type of relation, i.e. temporal order and causality. We then combine the two extraction components into an integrated relation extraction system, CATENA---CAusal and Temporal relation Extraction from NAtural language texts---, utilizing the presumption about event precedence in causality, that causing events must have happened before resulting events. Several resources and techniques to improve our relation extraction systems are also discussed, including word embeddings and training data expansion. 
Finally, we report our adaptation efforts of temporal information processing to languages other than English, namely Italian and Indonesian.",4 "verification of generalized inconsistency-aware knowledge and action bases (extended version). Knowledge and Action Bases (KABs) have been put forward as a semantically rich representation of a domain, using a DL KB to account for its static aspects, and actions to evolve its extensional part over time, possibly introducing new objects. Recently, KABs have been extended to manage inconsistency, with ad-hoc verification techniques geared towards specific semantics. This work provides a twofold contribution along this line of research. On the one hand, we enrich KABs with a high-level, compact action language inspired by Golog, obtaining so-called Golog-KABs (GKABs). On the other hand, we introduce a parametric execution semantics for GKABs, so as to elegantly accommodate a plethora of inconsistency-aware semantics based on the notion of repair. We provide several reductions for the verification of sophisticated first-order temporal properties over inconsistency-aware GKABs, and show that it can be addressed using known techniques, developed for standard KABs.",4 "by-passing the Kohn-Sham equations with machine learning. Last year, at least 30,000 scientific papers used the Kohn-Sham scheme of density functional theory to solve electronic structure problems in a wide variety of scientific fields, ranging from materials science to biochemistry to astrophysics. Machine learning holds the promise of learning the kinetic energy functional via examples, by-passing the need to solve the Kohn-Sham equations. This should yield substantial savings in computer time, allowing either larger systems or longer time-scales to be tackled, but attempts to machine-learn this functional have been limited by the need to find its derivative. The present work overcomes this difficulty by directly learning the density-potential and energy-density maps for test systems and various molecules. Both improved accuracy and lower computational cost with this method are demonstrated by reproducing DFT energies for a range of molecular geometries generated during molecular dynamics simulations. 
Moreover, the methodology could be applied directly to quantum chemical calculations, allowing the construction of density functionals of quantum-chemical accuracy.",15 "reinforcement learning using quantum Boltzmann machines. We investigate whether quantum annealers with select chip layouts can outperform classical computers in reinforcement learning tasks. We associate a transverse field Ising spin Hamiltonian with a layout of qubits similar to that of a deep Boltzmann machine (DBM), and use simulated quantum annealing (SQA) to numerically simulate quantum sampling from this system. We design a reinforcement learning algorithm in which the set of visible nodes representing the states and actions of an optimal policy are the first and last layers of the deep network. In the absence of a transverse field, our simulations show that DBMs train more effectively than restricted Boltzmann machines (RBM) with the same number of weights. Since sampling from Boltzmann distributions of a DBM is not classically feasible, this is evidence of the advantage of a non-Turing sampling oracle. We then develop a framework for training the network as a quantum Boltzmann machine (QBM) in the presence of a significant transverse field for reinforcement learning. This further improves the reinforcement learning method using DBMs.",18 "associative content-addressable networks with exponentially many robust stable states. The brain must robustly store a large number of memories, corresponding to the many events encountered over a lifetime. However, the number of memory states in existing neural network models either grows weakly with network size, or recall fails catastrophically with vanishingly little noise. We construct an associative content-addressable memory with exponentially many stable states and robust error-correction. The network possesses expander graph connectivity on a restricted Boltzmann machine architecture. The expansion property allows simple neural network dynamics to perform on par with modern error-correcting codes. Appropriate networks can be constructed with sparse random connections, glomerular nodes, and associative learning using low dynamic-range weights. 
Thus, sparse quasi-random structures---characteristic of important error-correcting codes---may provide high-performance computation in artificial neural networks and the brain.",16 "scalability in neural control of musculoskeletal robots. Anthropomimetic robots are robots that sense, behave, interact and feel like humans. By this definition, anthropomimetic robots require not only human-like physical hardware and actuation, but also brain-like control and sensing. The most self-evident realization to meet those requirements would be a human-like musculoskeletal robot with a brain-like neural controller. While both musculoskeletal robotic hardware and neural control software have existed for decades, a scalable approach that could be used to build and control an anthropomimetic human-scale robot has not been demonstrated yet. Combining Myorobotics, a framework for musculoskeletal robot development, with SpiNNaker, a neuromorphic computing platform, we present a proof-of-principle system that can scale to dozens of neurally-controlled, physically compliant joints. At its core, it implements a closed-loop cerebellar model which provides real-time low-level neural control at minimal power consumption and maximal extensibility: higher-order (e.g., cortical) neural networks and neuromorphic sensors like silicon-retinae and -cochleae can be naturally incorporated.",4 "shading and local shape. We develop a framework for extracting a concise representation of the shape information available from diffuse shading in a small image patch. This produces a mid-level scene descriptor, comprised of local shape distributions that are inferred separately at every image patch across multiple scales. The framework is based on a quadratic representation of local shape that, in the absence of noise, has guarantees on recovering accurate local shape and lighting. When noise is present, the inferred local shape distributions provide useful shape information without over-committing to any particular image explanation. These local shape distributions naturally encode the fact that some smooth diffuse regions are more informative than others, and they enable efficient and robust reconstruction of object-scale shape. 
Experimental results show that our approach to surface reconstruction compares well against the state-of-art on both synthetic images and captured photographs.",4 "deep learning methods for efficient large scale video labeling. We present a solution to the ""Google Cloud and YouTube-8M Video Understanding Challenge"" that ranked 5th place. The proposed model is an ensemble of three model families, two frame level and one video level. The training was performed on an augmented dataset, with cross validation.",19 "view adaptive recurrent neural networks for high performance human action recognition from skeleton data. Skeleton-based human action recognition has recently attracted increasing attention due to the popularity of 3D skeleton data. One main challenge lies in the large view variations in captured human actions. We propose a novel view adaptation scheme to automatically regulate observation viewpoints during the occurrence of an action. Rather than re-positioning the skeletons based on a human defined prior criterion, we design a view adaptive recurrent neural network (RNN) with LSTM architecture, which enables the network itself to adapt to the most suitable observation viewpoints from end to end. Extensive experiment analyses show that the proposed view adaptive RNN model strives to (1) transform the skeletons of various views to much more consistent viewpoints, and (2) maintain the continuity of the action rather than transforming every frame to the same position with the same body orientation. Our model achieves significant improvement over the state-of-the-art approaches on three benchmark datasets.",4 "predictive-state decoders: encoding the future into recurrent networks. Recurrent neural networks (RNNs) are a vital modeling technique that rely on internal states learned indirectly by optimization of a supervised, unsupervised, or reinforcement training loss. RNNs are used to model dynamic processes that are characterized by underlying latent states whose form is often unknown, precluding its analytic representation inside an RNN. In the Predictive-State Representation (PSR) literature, latent state processes are modeled by an internal state representation that directly models the distribution of future observations, and most recent work in this area has relied on explicitly representing and targeting sufficient statistics of this probability distribution. 
We seek to combine the advantages of RNNs and PSRs by augmenting existing state-of-the-art recurrent neural networks with Predictive-State Decoders (PSDs), which add supervision to the network's internal state representation to target predicting future observations. Predictive-State Decoders are simple to implement and easily incorporated into existing training pipelines via an additional loss regularization. We demonstrate the effectiveness of PSDs with experimental results in three different domains: probabilistic filtering, imitation learning, and reinforcement learning. In each, our method improves the statistical performance of state-of-the-art recurrent baselines, and does so with fewer iterations and less data.",19 "deep BCD-net using identical encoding-decoding CNN structures for iterative image recovery. In ""extreme"" computational imaging that collects extremely undersampled or noisy measurements, obtaining an accurate image within a reasonable computing time is challenging. Incorporating image mapping convolutional neural networks (CNN) into iterative image recovery has great potential to resolve this issue. This paper 1) incorporates an image mapping CNN using identical convolutional kernels in both encoders and decoders into a block coordinate descent (BCD) optimization method -- referred to as BCD-Net using identical encoding-decoding CNN structures -- and 2) applies the alternating direction method of multipliers to train the proposed BCD-Net. Numerical experiments show that, for a) denoising moderately low signal-to-noise-ratio images and b) extremely undersampled magnetic resonance imaging, the proposed BCD-Net achieves (significantly) more accurate image recovery, compared to BCD-Net using distinct encoding-decoding structures and/or the conventional image recovery model using both wavelets and total variation.",19 "accurate localization in dense urban areas using Google Street View images. Accurate information about the location and orientation of a camera in mobile devices is central to the utilization of location-based services (LBS). Mobile devices rely on GPS data, but this data is subject to inaccuracy due to imperfections in the quality of the signal provided by satellites. This shortcoming has spurred research into improving the accuracy of localization. 
Since mobile devices have a camera, a major thrust of this research seeks to acquire the local scene and apply image retrieval techniques, querying a GPS-tagged image database to find the best match for the acquired scene. Such techniques, however, are computationally demanding and unsuitable for real-time applications such as the assistive technology for navigation by the blind and visually impaired which motivated this work. To overcome the high complexity of those techniques, we investigated the use of inertial sensors to aid the image-retrieval-based approach. Armed with information about the media images, data from the GPS module along with orientation sensors such as the accelerometer and gyro, we sought to limit the size of the image set to search for the best match. Specifically, data from the orientation sensors along with the dilution of precision (DOP) of GPS are used to find the angle of view and an estimation of the position. We present an analysis of the reduction in the size of the image set to search, as well as simulations that demonstrate the effectiveness of a fast implementation, with 98% accuracy in the estimated position.",4 "unsupervised learning of regression mixture models with an unknown number of components. Regression mixture models are widely studied in statistics, machine learning and data analysis. Fitting regression mixtures is challenging and is usually performed by maximum likelihood using the expectation-maximization (EM) algorithm. However, it is well-known that initialization is crucial for EM. If the initialization is inappropriately performed, the EM algorithm may lead to unsatisfactory results. The EM algorithm also requires the number of clusters to be given a priori; the problem of selecting the number of mixture components requires using model selection criteria to choose one from a set of pre-estimated candidate models. We propose a new fully unsupervised algorithm to learn regression mixture models with an unknown number of components. The developed unsupervised learning approach consists of a penalized maximum likelihood estimation carried out by a robust expectation-maximization (EM) algorithm for fitting polynomial, spline and B-spline regression mixtures. 
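For the regression-mixture abstract above, the basic E and M steps (before the authors' penalization and robustness extensions, which are not reproduced here) can be sketched for a fixed two-component mixture of simple linear regressions. The function name and the synthetic crossing-lines data are our own illustration.

```python
import numpy as np

def em_mixture_linreg(x, y, n_iter=50, seed=0):
    """Plain EM for a 2-component mixture of linear regressions.
    E-step: responsibilities from Gaussian likelihoods.
    M-step: weighted least squares and variance update per component."""
    rng = np.random.default_rng(seed)
    X = np.column_stack([np.ones_like(x), x])        # intercept + slope
    betas = rng.standard_normal((2, 2))
    sigma2 = np.array([1.0, 1.0])
    mix = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: posterior component probabilities for each point
        resid = y[:, None] - X @ betas.T             # (n, 2)
        dens = mix * np.exp(-0.5 * resid**2 / sigma2) / np.sqrt(2 * np.pi * sigma2)
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: weighted least squares per component
        for k in range(2):
            w = r[:, k]
            A = X.T @ (w[:, None] * X)
            betas[k] = np.linalg.solve(A, X.T @ (w * y))
            sigma2[k] = (w * (y - X @ betas[k])**2).sum() / w.sum()
        mix = r.mean(axis=0)
    return betas, mix, r

# two crossing lines: y = 1 + 2x and y = 4 - x, plus noise
rng = np.random.default_rng(1)
x = rng.uniform(0, 5, 300)
comp = rng.random(300) < 0.5
y = np.where(comp, 1 + 2 * x, 4 - x) + 0.1 * rng.standard_normal(300)
betas, mix, r = em_mixture_linreg(x, y)
```

The abstract's point is precisely that this plain EM depends on its random initialization and on the fixed component count; the proposed penalized robust EM removes both requirements.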
The proposed learning approach is fully unsupervised: 1) it simultaneously infers the model parameters and the optimal number of regression mixture components from the data as learning proceeds, rather than in a two-fold scheme as in standard model-based clustering that uses model selection criteria afterward, and 2) it does not require accurate initialization, unlike the standard EM for regression mixtures. The developed approach is applied to curve clustering problems. Numerical experiments on simulated data show that the proposed robust EM algorithm performs well and provides accurate results in terms of robustness with regard to initialization, and in retrieving the optimal partition with the actual number of clusters. An application to real data in the framework of functional data clustering confirms the benefit of the proposed approach for practical applications.",19 "conversion of artificial recurrent neural networks to spiking neural networks for low-power neuromorphic hardware. In recent years, the field of neuromorphic low-power systems that consume orders of magnitude less power has gained significant momentum. However, their wider use is still hindered by the lack of algorithms that can harness the strengths of such architectures. While neuromorphic adaptations of representation learning algorithms are now emerging, the efficient processing of temporal sequences or variable length-inputs remains difficult. Recurrent neural networks (RNN) are widely used in machine learning to solve a variety of sequence learning tasks. In this work we present a train-and-constrain methodology that enables the mapping of machine learned (Elman) RNNs on a substrate of spiking neurons, while being compatible with the capabilities of current and near-future neuromorphic systems. This ""train-and-constrain"" method consists of first training RNNs using backpropagation through time, then discretizing the weights, and finally converting them to spiking RNNs by matching the responses of artificial neurons with those of the spiking neurons. We demonstrate our approach by mapping a natural language processing task (question classification), and demonstrate the entire mapping process of the recurrent layer of the network onto IBM's Neurosynaptic System ""TrueNorth"", a spike-based digital neuromorphic hardware architecture. TrueNorth imposes specific constraints on connectivity, and on neural and synaptic parameters. 
To satisfy these constraints, it was necessary to discretize the synaptic weights and neural activities to 16 levels, and to limit the fan-in to 64 inputs. We find that short synaptic delays are sufficient to implement the dynamical (temporal) aspect of the RNN in the question classification task. The hardware-constrained model achieved 74% accuracy in question classification using less than 0.025% of the cores of one TrueNorth chip, resulting in an estimated power consumption of ~17 uW.",4 "network structure and dynamics, and emergence of robustness by stabilizing selection in an artificial genome. Genetic regulation is a key component of development, but a clear understanding of the structure and dynamics of genetic networks is not yet at hand. In this work we investigate these properties within an artificial genome model originally introduced by Reil. We analyze statistical properties of randomly generated genomes both on the sequence and network level, and show that this model correctly predicts the frequency of genes in genomes as found in experimental data. Using an evolutionary algorithm based on stabilizing selection for a phenotype, we show that robustness against single base mutations, as well as against random changes in initial network states that mimic stochastic fluctuations in environmental conditions, can emerge in parallel. Evolved genomes exhibit characteristic patterns on both the sequence and network level.",16 "automatic network reconstruction using ASP. Building biological models by inferring functional dependencies from experimental data is an important issue in molecular biology. To relieve the biologist from this traditionally manual process, various approaches have been proposed to increase the degree of automation. However, available approaches often yield a single model only, rely on specific assumptions, and/or use dedicated, heuristic algorithms that are intolerant to changing circumstances or requirements in view of the rapid progress made in biotechnology. Our aim is to provide a declarative solution to the problem by appeal to answer set programming (ASP), overcoming these difficulties. We build upon an existing approach to automatic network reconstruction proposed by part of the authors. This approach has firm mathematical foundations and is well suited for ASP due to its combinatorial flavor, providing a characterization of all models explaining a set of experiments. 
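The 16-level weight discretization mentioned in the train-and-constrain abstract above can be illustrated with a plain uniform quantizer. This is an assumption for illustration only; TrueNorth's actual weight scheme is more constrained than a uniform grid.

```python
import numpy as np

def quantize(weights, levels=16):
    """Uniformly quantize `weights` onto `levels` values spanning their range."""
    w_min, w_max = weights.min(), weights.max()
    step = (w_max - w_min) / (levels - 1)
    # snap each weight to the nearest grid point
    return np.round((weights - w_min) / step) * step + w_min

rng = np.random.default_rng(0)
w = rng.standard_normal(1000)    # stand-in for trained RNN weights
wq = quantize(w)
```

After quantization, each weight is at most half a grid step away from its original value, which bounds the perturbation the spiking conversion has to absorb.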
The usage of ASP has several benefits over the existing heuristic algorithms. First, it is declarative and thus transparent for biological experts. Second, it is elaboration tolerant and thus allows for an easy exploration and incorporation of biological constraints. Third, it allows for exploring the entire space of possible models. Finally, our approach offers excellent performance, matching existing, special-purpose systems.",4 "robustly learning a Gaussian: getting optimal error, efficiently. We study the fundamental problem of learning the parameters of a high-dimensional Gaussian in the presence of noise -- where an $\varepsilon$-fraction of our samples were chosen by an adversary. We give robust estimators that achieve estimation error $O(\varepsilon)$ in total variation distance, which is optimal up to a universal constant that is independent of the dimension. In the case where just the mean is unknown, our robustness guarantee is optimal up to a factor of $\sqrt{2}$ and the running time is polynomial in $d$ and $1/\epsilon$. When both the mean and covariance are unknown, the running time is polynomial in $d$ and quasipolynomial in $1/\varepsilon$. Moreover, all of our algorithms require only a polynomial number of samples. Our work shows that the same sorts of error guarantees that were established over fifty years ago in the one-dimensional setting can also be achieved by efficient algorithms in high-dimensional settings.",4 "verifiability of argumentation semantics. Dung's abstract argumentation theory is a widely used formalism to model conflicting information and to draw conclusions in such situations. Hereby, the knowledge is represented by so-called argumentation frameworks (AFs), and the reasoning is done via semantics extracting acceptable sets. All reasonable semantics are based on the notion of conflict-freeness, which means arguments are only jointly acceptable when they are not linked within the AF. In this paper, we study the question of which information on top of conflict-free sets is needed to compute extensions of a semantics at hand. We introduce a hierarchy of so-called verification classes specifying the required amount of information. We show that well-known standard semantics are exactly verifiable through a certain such class. 
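The conflict-free sets at the base of the verifiability hierarchy above can be enumerated directly for a small AF. A minimal sketch with a hypothetical three-argument framework; the function name and example are ours, not from the paper.

```python
from itertools import combinations

def conflict_free_sets(arguments, attacks):
    """Enumerate all subsets of `arguments` with no internal attack.
    `attacks` is a set of (attacker, target) pairs."""
    result = []
    for r in range(len(arguments) + 1):
        for subset in combinations(sorted(arguments), r):
            s = set(subset)
            # a set is conflict-free iff no member attacks another member
            if not any((a, b) in attacks for a in s for b in s):
                result.append(s)
    return result

# hypothetical AF: a attacks b, b attacks c
cf = conflict_free_sets({"a", "b", "c"}, {("a", "b"), ("b", "c")})
```

Here the conflict-free sets are {}, {a}, {b}, {c}, and {a, c}; the paper's question is what extra information beyond this base collection each semantics needs.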
Our framework also gives a means to study semantics lying in between known semantics, thus contributing to the abstract understanding of the different features argumentation semantics offer.",4 "FastDeRain: a novel video rain streak removal method using directional gradient priors. Rain streak removal is an important issue in outdoor vision systems and has recently been investigated extensively. In this paper, we propose a novel video rain streak removal approach, FastDeRain, which fully considers the discriminative characteristics of rain streaks and the clean video in the gradient domain. Specifically, on the one hand, rain streaks are sparse and smooth along the direction of the raindrops, whereas on the other hand, clean videos exhibit piecewise smoothness along the rain-perpendicular direction and continuity along the temporal direction. These smoothness and continuity properties result in sparse distributions in the different directional gradient domains, respectively. Thus, we minimize 1) the $\ell_1$ norm to enhance the sparsity of the underlying rain streaks, 2) two $\ell_1$ norm unidirectional total variation (TV) regularizers to guarantee the anisotropic spatial smoothness, and 3) the $\ell_1$ norm of the time-directional difference operator to characterize the temporal continuity. A split augmented Lagrangian shrinkage algorithm (SALSA) based algorithm is designed to solve the proposed minimization model. Experiments conducted on synthetic and real data demonstrate the effectiveness and efficiency of the proposed method. According to comprehensive quantitative performance measures, our approach outperforms state-of-the-art methods, especially when taking the running time into account.",4 "consistent kernel mean estimation for functions of random variables. We provide a theoretical foundation for non-parametric estimation of functions of random variables using kernel mean embeddings. We show that for any continuous function $f$, consistent estimators of the mean embedding of a random variable $X$ lead to consistent estimators of the mean embedding of $f(X)$. For Mat\'ern kernels and sufficiently smooth functions we also provide rates of convergence. Our results extend to functions of multiple random variables. 
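For the kernel mean estimation abstract above, the i.i.d.-sample case amounts to pushing samples of $X$ through $f$ before averaging kernel evaluations. A small sketch with a Gaussian kernel; the function names and the choice of $f$ are our own illustration, not the paper's notation.

```python
import numpy as np

def rbf(u, v, gamma=1.0):
    """Gaussian (RBF) kernel on scalars."""
    return np.exp(-gamma * (u - v) ** 2)

def mean_embedding_at(samples, point, gamma=1.0):
    """Empirical mean embedding evaluated at `point`:
    (1/n) * sum_i k(x_i, point)."""
    return rbf(np.asarray(samples), point, gamma).mean()

def f(x):                       # any continuous function of X
    return np.sin(x)

rng = np.random.default_rng(0)
x = rng.standard_normal(5000)
# estimate mu_{f(X)} by applying f to the samples of X and averaging
est = mean_embedding_at(f(x), 0.5)
```

Consistency here means that `est` converges to $\mathbb{E}[k(f(X), 0.5)]$ as the sample size grows; the paper extends this to reduced-set expansions with dependent expansion points.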
If the variables are dependent, we require an estimator of the mean embedding of their joint distribution as a starting point; if they are independent, it is sufficient to have separate estimators of the mean embeddings of their marginal distributions. In either case, our results cover both mean embeddings based on i.i.d. samples as well as ""reduced set"" expansions in terms of dependent expansion points. The latter serves as a justification for using such expansions to limit memory resources when applying the approach as a basis for probabilistic programming.",19 "Heinrich Behmann's contributions to second-order quantifier elimination from the view of computational logic. For relational monadic formulas (the L\""owenheim class), second-order quantifier elimination, which is closely related to computation of uniform interpolants, projection and forgetting - operations that currently receive much attention in knowledge processing - always succeeds. The decidability proof for this class by Heinrich Behmann from 1922 explicitly proceeds by elimination with equivalence preserving formula rewriting. We reconstruct the results from Behmann's publication in detail and discuss related issues that are relevant in the context of modern approaches to second-order quantifier elimination in computational logic. In addition, an extensive documentation of the letters and manuscripts in Behmann's bequest that concern second-order quantifier elimination is given, including a commented register and English abstracts of the German sources with focus on technical material. In the late 1920s Behmann attempted to develop an elimination-based decision method for formulas with predicates whose arity is larger than one. His manuscripts and the correspondence with Wilhelm Ackermann show technical aspects that are still of interest today and give insight into the genesis of Ackermann's landmark paper ""Untersuchungen \""uber das Eliminationsproblem der mathematischen Logik"" from 1935, which laid the foundation of the two prevailing modern approaches to second-order quantifier elimination.",4 "intrusion detection on smartphones. Smartphone technology is becoming the predominant communication tool for people across the world. People use their smartphones to keep their contact data, browse the internet, exchange messages, keep notes, carry their personal files and documents, etc. 
While browsing, users are also capable of shopping online, thus provoking the need to type in credit card numbers and security codes. As smartphones become more widespread, so do the security threats and vulnerabilities facing this technology. Recent news articles indicate a huge increase in malware and viruses for the operating systems employed by smartphones (primarily Android and iOS). Major limitations of smartphone technology are its processing power and its scarce energy source, since smartphones rely on battery usage. Since smartphones are devices that change their network location as the user moves between different places, intrusion detection systems for smartphone technology are often classified as IDSs designed for mobile ad-hoc networks. The aim of this research is to give a brief overview of IDS technology, an overview of the major machine learning and pattern recognition algorithms used in IDS technologies, and an overview of the security models of iOS and Android, and to propose a new host-based IDS model for smartphones and create a proof-of-concept application for the Android platform based on the newly proposed model. Keywords: IDS, SVM, Android, iOS.",4 "a probabilistic interpretation of linear solvers. This manuscript proposes a probabilistic framework for algorithms that iteratively solve unconstrained linear problems $Bx = b$ with positive definite $B$ for $x$. The goal is to replace the point estimates returned by existing methods with a Gaussian posterior belief over the elements of the inverse of $B$, which can be used to estimate errors. Recent probabilistic interpretations of the secant family of quasi-Newton optimization algorithms are extended. Combined with properties of the conjugate gradient algorithm, this leads to uncertainty-calibrated methods with very limited cost overhead over conjugate gradients, a self-contained novel interpretation of the quasi-Newton and conjugate gradient algorithms, and a foundation for new nonlinear optimization methods.",12 "computational estimate visualisation and evaluation of an agent classified rules learning system. Student modelling with agent classified rules learning, applied to the development of an intelligent pre-assessment system, has been presented in [10],[11]. 
In this paper, we demystify the theory behind the development of the pre-assessment system, followed by computational experimentation and graph visualisation of the agent classified rules learning algorithm for the estimation and prediction of classified rules. In addition, we present preliminary results of the pre-assessment system evaluation. From the results, we gathered that the system performed according to its design specification.",4 "general methodology for the determination of 2D bodies' elastic deformation invariants. application to the automatic identification of parasites. A novel methodology is introduced that exploits 2D images of arbitrary elastic body deformation instances, so as to quantify mechano-elastic characteristics that are deformation invariant. Determination of such characteristics allows for developing methods that offer an image of the undeformed body. General assumptions about the mechano-elastic properties of the bodies are stated, which lead to two different approaches for obtaining the bodies' deformation invariants. One is developed to spot the deformed body's neutral line and its cross sections, and it solves the deformation PDEs by performing a set of equivalent image operations on the deformed body images. These processes may furnish the body's undeformed version from its deformed image. This is confirmed by obtaining the undeformed shape of deformed parasites, cells (protozoa), fibers and human lips. In addition, the method has been applied to the important problem of automatic classification of parasites from their microscopic images. To achieve this, we first apply the previous method to straighten the highly deformed parasites, and then apply a dedicated curve classification method to the straightened parasite contours. It is demonstrated that essentially different deformations of the same parasite give rise to a practically identical undeformed shape, thus confirming the consistency of the introduced methodology. Finally, the developed pattern recognition method classifies the unwrapped parasites into 6 families, with an accuracy rate of 97.6%.",4 disentangled representations for manipulation of sentiment in text. The ability to change arbitrary aspects of a text while leaving the core message intact could have a strong impact in fields like marketing and politics by enabling e.g. automatic optimization of message impact and personalized language adapted to the receiver's profile. 
In this paper we take a first step towards such a system by presenting an algorithm to manipulate the sentiment of a text while preserving its semantics, using disentangled representations. Validation is performed by examining trajectories in embedding space and analyzing transformed sentences for semantic preservation while expressing the desired sentiment shift.,4 "In-bed pose estimation: deep learning with a shallow dataset. Although human pose estimation for various computer vision (CV) applications has been studied extensively in the last decades, in-bed pose estimation using camera-based vision methods has been ignored by the CV community, as it is assumed to be identical to general purpose pose estimation methods. However, in-bed pose estimation has its own specialized aspects and comes with specific challenges, including notable differences in lighting conditions throughout the day and a pose distribution different from the common human surveillance viewpoint. In this paper, we demonstrate that these challenges significantly lessen the effectiveness of existing general purpose pose estimation models. In order to address the lighting variation challenge, an infrared selective (IRS) image acquisition technique is proposed to provide uniform quality data under various lighting conditions. A deep learning framework has proven effective in modeling human pose estimation; however, the lack of a large public dataset of in-bed poses prevents us from training a large network from scratch. In this work, we explored the idea of employing a pre-trained convolutional neural network (CNN) model, trained on large public datasets of general human poses, and fine-tuning the model using our shallow (limited in size and different in perspective and color) in-bed IRS dataset. We developed an IRS imaging system and collected IRS image data from several realistic life-size mannequins in a simulated hospital room environment. A pre-trained CNN called the convolutional pose machine (CPM) was repurposed for in-bed pose estimation by fine-tuning its specific intermediate layers. Using a HOG rectification method, the pose estimation performance of CPM was significantly improved by 26.4% under the PCK0.1 criterion compared to the model without rectification.",4 "Label efficient learning by exploiting multi-class output codes.
We present a new perspective on popular multi-class algorithmic techniques such as one-vs-all and error correcting output codes. Rather than studying the behavior of these techniques for supervised learning, we establish a connection between the success of these methods and the existence of label-efficient learning procedures. We show that in both the realizable and agnostic cases, if output codes are successful at learning from labeled data, they implicitly assume structure on how the classes are related. By making that structure explicit, we design learning algorithms to recover the classes with low label complexity. We provide results for the commonly studied cases of one-vs-all learning and codewords whose classes are well separated. We additionally consider the more challenging case where the codewords are not well separated, but satisfy a boundary features condition that captures the natural intuition that every bit of the codewords should be significant.",4 "Optimal learning for sequential decision making with expensive cost functions and stochastic binary feedbacks. We consider the problem of sequentially making decisions that are rewarded by ""successes"" and ""failures"", which can be predicted through an unknown relationship that depends on a partially controllable vector of attributes of each instance. The learner takes an active role in selecting samples from the instance pool. The goal is to maximize the probability of success in either the offline (training) or online (testing) phases. Our problem is motivated by real-world applications where observations are time-consuming and/or expensive. We develop a knowledge gradient policy using an online Bayesian linear classifier to guide the experiment by maximizing the expected value of information from labeling each alternative. We provide a finite-time analysis of the estimated error and show that the maximum likelihood estimator based on the data produced by the KG policy is consistent and asymptotically normal. We also show that the knowledge gradient policy is asymptotically optimal in the offline setting. This work extends the knowledge gradient to the setting of contextual bandits. We report the results of a series of experiments that demonstrate its efficiency.",19 "Distant IE by bootstrapping using lists and document structure. Distant labeling for information extraction (IE) suffers from noisy training data. We describe a way of reducing the noise associated with distant IE by identifying coupling constraints between potential instance labels.
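The output-code abstract above revolves around assigning each class a binary codeword and predicting by nearest codeword in Hamming distance. A minimal sketch, with a hypothetical three-class codebook and hand-set bit predictions standing in for trained bit classifiers:

```python
def hamming(a, b):
    """Number of positions where two equal-length bit tuples differ."""
    return sum(u != v for u, v in zip(a, b))

def decode(bit_predictions, codebook):
    """Return the class whose codeword is nearest in Hamming distance."""
    return min(codebook, key=lambda c: hamming(bit_predictions, codebook[c]))

# Hypothetical 4-bit codebook (illustrative, not from the paper);
# pairwise distances are large enough to absorb one flipped bit.
codebook = {"cat": (0, 0, 0, 0), "dog": (1, 1, 1, 0), "bird": (1, 0, 1, 1)}

pred_clean = decode((0, 0, 0, 0), codebook)  # exact match -> "cat"
pred_noisy = decode((1, 1, 0, 0), codebook)  # one bit off dog's codeword
```

One-vs-all is the special case where the codewords are the rows of the identity matrix; the paper's point is that well-separated codewords implicitly encode structure among the classes.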
As one example of coupling, items in a list are likely to have the same label. A second example of coupling comes from analysis of document structure: in some corpora, sections can be identified such that items in the same section are likely to have the same label. Such sections do not exist in all corpora, but we show that augmenting a large corpus with coupling constraints from even a small, well-structured corpus can improve performance substantially, doubling F1 on one task.",4 "Coherent online video style transfer. Training a feed-forward network for fast neural style transfer of images has proven successful. However, the naive extension of processing a video frame by frame is prone to producing flickering results. We propose the first end-to-end network for online video style transfer, which generates temporally coherent stylized video sequences in near real-time. Two key ideas include an efficient network by incorporating short-term coherence, and propagating short-term coherence to long-term, which ensures consistency over a larger period of time. Our network can incorporate different image stylization networks. We show that the proposed method clearly outperforms the per-frame baseline both qualitatively and quantitatively. Moreover, it can achieve visually comparable coherence to optimization-based video style transfer, while being three orders of magnitude faster in runtime.",4 "UAVs using Bayesian optimization to locate WiFi devices. We address the problem of localizing non-collaborative WiFi devices in a large region. Our main motive is to localize humans by localizing their WiFi devices, e.g. in search-and-rescue operations after a natural disaster. We use an active sensing approach that relies on unmanned aerial vehicles (UAVs) to collect signal-strength measurements at informative locations. The problem is challenging since a measurement is received at arbitrary times and only when the UAV is in close proximity to the device. For these reasons, it is extremely important to make prudent decisions with few measurements. We use a Bayesian optimization approach based on Gaussian process (GP) regression. This approach works well for our application since GPs give reliable predictions with few measurements and Bayesian optimization makes a judicious trade-off between exploration and exploitation.
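The exploration-exploitation trade-off mentioned in the Bayesian optimization abstract above is commonly realized with an upper-confidence-bound acquisition over the GP posterior. A minimal sketch, where the posterior mean and standard deviation at candidate waypoints are assumed given (the numbers and the UCB acquisition are illustrative, not the paper's exact acquisition function):

```python
def ucb_choice(candidates, mean, std, kappa=2.0):
    """Pick the candidate maximizing mean + kappa * std:
    exploitation (high mean) plus exploration (high uncertainty)."""
    scores = [mean[i] + kappa * std[i] for i in range(len(candidates))]
    best = max(range(len(candidates)), key=lambda i: scores[i])
    return candidates[best]

# Hypothetical GP posterior over three candidate UAV waypoints:
waypoints = ["A", "B", "C"]
mean = [0.6, 0.5, 0.2]   # predicted signal strength
std = [0.1, 0.4, 0.1]    # predictive uncertainty
nxt = ucb_choice(waypoints, mean, std)  # "B": uncertain but promising
```

Larger `kappa` favors exploration; with `kappa=0` the rule degenerates to pure exploitation of the current mean.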
In field experiments conducted over a region of 1000 $\times$ 1000 $m^2$, we show that our approach reduces the search area to less than 100 meters around the WiFi device within 5 minutes only. Overall, our approach localizes the device in less than 15 minutes with an error of less than 20 meters.",4 "Robust text detection in natural scene images. Text detection in natural scene images is an important prerequisite for many content-based image analysis tasks. In this paper, we propose an accurate and robust method for detecting texts in natural scene images. A fast and effective pruning algorithm is designed to extract maximally stable extremal regions (MSERs) as character candidates using the strategy of minimizing regularized variations. Character candidates are grouped into text candidates by a single-link clustering algorithm, where the distance weights and threshold of the clustering algorithm are learned automatically by a novel self-training distance metric learning algorithm. The posterior probabilities of text candidates corresponding to non-text are estimated with a character classifier; text candidates with high non-text probabilities are eliminated and finally texts are identified with a text classifier. The proposed system is evaluated on the ICDAR 2011 robust reading competition dataset; the f measure of 76% is significantly better than the state-of-the-art performance of 71%. Experimental results on a publicly available multilingual dataset also show that the proposed method can outperform the competitive method with an f measure increase of over 9 percent. Finally, we have set up an online demo of our proposed scene text detection system at http://kems.ustb.edu.cn/learning/yin/dtext.",4 "Orthogonal and idempotent transformations for learning deep neural networks. Identity transformations, used as skip-connections in residual networks, directly connect convolutional layers close to the input and those close to the output in deep neural networks, improving information flow and thus easing the training. In this paper, we introduce two alternative linear transforms, an orthogonal transformation and an idempotent transformation.
According to the definition and properties of orthogonal and idempotent matrices, the product of multiple orthogonal (or idempotent) matrices, used to form the linear transformations, is equal to a single orthogonal (or idempotent) matrix, resulting in improved information flow and eased training. One interesting point is that the success essentially stems from feature reuse and gradient reuse in forward and backward propagation, which maintain the information flow and eliminate the gradient vanishing problem, rather than from the express way of skip-connections. We empirically demonstrate the effectiveness of our proposed two transformations: similar performance in single-branch networks and even superior performance in multi-branch networks in comparison to identity transformations.",4 "PixelNN: example-based image synthesis. We present a simple nearest-neighbor (NN) approach that synthesizes high-frequency photorealistic images from an ""incomplete"" signal such as a low-resolution image, a surface normal map, or edges. Current state-of-the-art deep generative models designed for such conditional image synthesis lack two important things: (1) they are unable to generate a large set of diverse outputs, due to the mode collapse problem; (2) they are not interpretable, making it difficult to control the synthesized output. We demonstrate that NN approaches can potentially address such limitations, but suffer in accuracy on small datasets. We design a simple pipeline that combines the best of both worlds: the first stage uses a convolutional neural network (CNN) to map the input to an (overly-smoothed) image, and the second stage uses a pixel-wise nearest neighbor method to map the smoothed output to multiple high-quality, high-frequency outputs in a controllable manner. We demonstrate our approach for various input modalities, and for various domains ranging from human faces to cats-and-dogs to shoes and handbags.",4 "Revisiting the problem of mobile robot map building: a hierarchical Bayesian approach. We present an application of hierarchical Bayesian estimation to robot map building. The revisiting problem occurs when a robot has to decide whether it is seeing a previously-built portion of a map, or is exploring new territory. This is a difficult decision problem, requiring the probability of being outside of the current known map.
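The matrix properties invoked in the orthogonal/idempotent abstract above are easy to verify numerically: a product of rotations (orthogonal matrices) is again a rotation, and a projector composed with itself is unchanged. A minimal 2x2 check (the specific matrices are illustrative assumptions):

```python
import math

def matmul(A, B):
    """Multiply two square matrices given as lists of lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def rotation(theta):
    """A 2x2 orthogonal matrix (plane rotation by theta)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

# Product of orthogonal matrices is orthogonal: R(a) R(b) = R(a + b).
R = matmul(rotation(0.3), rotation(0.4))

# An idempotent matrix (projector onto the x-axis): P P = P.
P = [[1.0, 0.0], [0.0, 0.0]]
PP = matmul(P, P)
```

This closure under composition is exactly what lets a chain of such transforms behave like a single well-conditioned transform, preserving information flow through depth.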
To estimate this probability, we model the structure of a ""typical"" environment as a hidden Markov model that generates the sequences of views observed by a robot navigating through the environment. A Dirichlet prior over structural models is learned from previously explored environments. Whenever the robot explores a new environment, the posterior over the model is estimated from the Dirichlet hyperparameters. Our approach is implemented and tested in the context of multi-robot map merging, a particularly difficult instance of the revisiting problem. Experiments with robot data show that the technique yields strong improvements over alternative methods.",4 "Reducing drift in visual odometry by inferring sun direction using a Bayesian convolutional neural network. We present a method to incorporate global orientation information from the sun into a visual odometry pipeline using only the existing image stream, where the sun is typically not visible. We leverage recent advances in Bayesian convolutional neural networks to train and implement a sun detection model that infers a three-dimensional sun direction vector from a single RGB image. Crucially, our method also computes a principled uncertainty associated with each prediction, using a Monte Carlo dropout scheme. We incorporate this uncertainty into a sliding window stereo visual odometry pipeline, where accurate uncertainty estimates are critical for optimal data fusion. Our Bayesian sun detection model achieves a median error of approximately 12 degrees on the KITTI odometry benchmark training set, and yields improvements of up to 42% in translational ARMSE and 32% in rotational ARMSE compared to standard VO. An open source implementation of our Bayesian CNN sun estimator (Sun-BCNN) using Caffe is available at https://github.com/utiasstars/sun-bcnn-vo",4 "Punny captions: witty wordplay in image descriptions. Wit is a quintessential form of rich inter-human interaction, and is often grounded in a specific situation (e.g., a comment in response to an event). In this work, we attempt to build computational models that can produce witty descriptions for a given image. Inspired by a cognitive account of humor appreciation, we employ linguistic wordplay, specifically puns. We compare our approach against meaningful baseline approaches via human studies.
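The Monte Carlo dropout scheme mentioned in the sun-direction abstract above estimates uncertainty by repeating stochastic forward passes with units randomly dropped, then summarizing the spread of the outputs. A toy sketch for a single linear unit (the weights, inputs, and dropout rate are illustrative assumptions, not the paper's network):

```python
import random

def mc_dropout_predict(weights, x, p_drop=0.5, n_samples=200, seed=0):
    """Monte Carlo dropout for one linear unit: repeat stochastic forward
    passes, keeping each weight with probability 1 - p_drop (rescaled),
    and report the predictive mean and std (the uncertainty estimate)."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_samples):
        y = sum(w * xi for w, xi in zip(weights, x)
                if rng.random() > p_drop) / (1.0 - p_drop)
        outputs.append(y)
    mean = sum(outputs) / n_samples
    var = sum((y - mean) ** 2 for y in outputs) / n_samples
    return mean, var ** 0.5

mean, std = mc_dropout_predict([0.5, -0.2, 0.8], [1.0, 2.0, 3.0])
```

The mean approximates the deterministic prediction, while a large std flags inputs the model is unsure about, which is what the odometry pipeline uses to down-weight unreliable sun estimates during fusion.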
In a Turing test style evaluation, people find our model's description of an image to be wittier than a human's witty description 55% of the time!",4 "ConvSRC: smartphone based periocular recognition using deep convolutional neural network and sparsity augmented collaborative representation. Smartphone based periocular recognition has gained significant attention from the biometric research community because of the limitations of biometric modalities like face, iris etc. Existing methods for periocular recognition employ hand-crafted features. Recently, learning based image representation techniques like deep convolutional neural networks (CNN) have shown outstanding performance in many visual recognition tasks. A CNN needs a huge volume of data for its learning, but for periocular recognition only a limited amount of data is available. A solution is to use a CNN pre-trained on a dataset from a related domain; in this case the challenge is to extract efficiently discriminative features. Using a pre-trained CNN model (VGG-Net), we propose a simple, efficient and compact image representation technique that takes into account the wealth of information and the sparsity existing in the activations of the convolutional layers, and employs principal component analysis. For recognition, we use the efficient and robust sparsity augmented collaborative representation based classification (SA-CRC) technique. For a thorough evaluation of ConvSRC (the proposed system), experiments were carried out on the challenging VISOB database, which was presented for the periocular recognition competition at ICIP2016. The obtained results show the superiority of ConvSRC over state-of-the-art methods; it obtains a GMR of more than 99% at FMR = 10^-3 and outperforms the first winner of the ICIP2016 challenge by 10%.",4 "An efficient circle detection scheme in digital images using ant system algorithm. Detection of geometric features in digital images is an important exercise in image analysis and computer vision. The Hough transform techniques for detection of circles require a huge memory space for data processing, hence requiring a lot of time in computing the locations of the data space, and in writing to and searching the memory space. In this paper we propose a novel and efficient scheme for detecting circles in edge-detected grayscale digital images. We use the ant-system algorithm for this purpose, which has not yet found much application in this field.
The main feature of our scheme is that it can detect intersecting as well as non-intersecting circles, and its time efficiency makes it useful for real time applications. We build an ant system of a new type that finds closed loops in the image and tests them for circles.",4 "A survey of techniques for improving the generalization ability of genetic programming solutions. In the field of empirical modeling using genetic programming (GP), it is important to evolve a solution with good generalization ability. The generalization ability of GP solutions is affected by two important issues: bloat and over-fitting. We have surveyed and classified the existing literature related to the different techniques used by the GP research community to deal with these issues. We also point out the limitations of these techniques, if any. Moreover, a classification of different bloat control approaches and of measures of bloat and over-fitting is also discussed. We believe this work will be useful to GP practitioners in the following ways: (i) to better understand the concepts of generalization in GP, (ii) to compare existing bloat and over-fitting control techniques, and (iii) to select an appropriate approach to improve the generalization ability of GP evolved solutions.",4 "Approximate Kalman filter Q-learning for continuous state-space MDPs. We seek to learn an effective policy for a Markov decision process (MDP) with continuous states via Q-learning. Given a set of basis functions over state-action pairs, we search for a corresponding set of linear weights that minimizes the mean Bellman residual. Our algorithm uses a Kalman filter model to estimate those weights, and we have developed a simpler approximate Kalman filter model that outperforms the current state of the art projected TD-learning methods on several standard benchmark problems.",4 "The power of asymmetry in binary hashing. When approximating binary similarity using the Hamming distance between short binary hashes, we show that even if the similarity is symmetric, we can have shorter and more accurate hashes by using two distinct code maps, i.e. by approximating the similarity between $x$ and $x'$ as the Hamming distance between $f(x)$ and $g(x')$, for two distinct binary codes $f,g$, rather than as the Hamming distance between $f(x)$ and $f(x')$.",4 "Genealogical distance as a diversity estimate in evolutionary algorithms.
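The asymmetric hashing abstract above scores similarity as the Hamming distance between two *different* code maps, $f(x)$ and $g(x')$. A minimal sketch with hypothetical 4-bit codes standing in for learned code maps (the specific bit patterns are illustrative assumptions):

```python
def hamming(a, b):
    """Number of differing positions between two equal-length bit tuples."""
    return sum(u != v for u, v in zip(a, b))

# Two distinct code maps over a toy item set; in the paper these
# would be learned jointly to approximate a target similarity.
f_codes = {"x1": (0, 0, 1, 1), "x2": (0, 1, 1, 1)}
g_codes = {"x1": (0, 0, 1, 1), "x2": (1, 1, 1, 1)}

def asym_distance(x, x_prime):
    """Asymmetric scheme: compare f(x) against g(x')."""
    return hamming(f_codes[x], g_codes[x_prime])

d_self = asym_distance("x1", "x1")   # 0: an item is similar to itself
d_pair = asym_distance("x1", "x2")   # larger: x1 and x2 are dissimilar
```

Even though the underlying similarity is symmetric, letting `f_codes` and `g_codes` differ enlarges the family of similarities that short codes can represent, which is the paper's central observation.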
The evolutionary edit distance between two individuals in a population, i.e., the number of applications of the genetic operator it would take the evolutionary process to generate one individual starting from the other, seems like a promising estimate of the diversity of said individuals. We introduce genealogical diversity, i.e., estimating two individuals' degree of relatedness by analyzing large, unused parts of their genome, as a computationally efficient method to approximate this measure of diversity.",4 "Turnover prediction of shares using data mining techniques: a case study. Predicting the turnover of a company in the ever fluctuating stock market has always proved to be a precarious situation and certainly a difficult task at hand. Data mining is a well-known sphere of computer science that aims at extracting meaningful information from large databases. However, despite the existence of many algorithms for the purpose of predicting future trends, their efficiency is questionable as their predictions suffer from a high error rate. The objective of this paper is to investigate various classification algorithms to predict the turnover of different companies based on their stock price. The authorized dataset for predicting the turnover was taken from www.bsc.com and included the stock market values of various companies over the past 10 years. The algorithms were investigated using the ""R"" tool. The feature selection algorithm Boruta was run on the dataset to extract the most important and influential features for classification. With the extracted features, the total turnover of a company was predicted using various classification algorithms like random forest, decision tree, SVM and multinomial regression. This prediction mechanism was implemented to predict the turnover of a company on an everyday basis, and hence could help navigate dubious stock market trades. An accuracy rate of 95% was achieved in the prediction process. Moreover, the importance of stock market attributes was established as well.",4 "Generative adversarial networks using adaptive convolution. Most existing GAN architectures that generate images use transposed convolution or a resize-convolution as the upsampling algorithm from lower to higher resolution feature maps in the generator. We argue that this kind of fixed operation is problematic for GANs to model objects that have different visual appearances.
We propose a novel adaptive convolution method that learns the upsampling algorithm based on the local context at each location, to address this problem. We modify a baseline GAN architecture by replacing normal convolutions with adaptive convolutions in the generator. Experiments on the CIFAR-10 dataset show that our modified models improve the baseline model by a large margin. Furthermore, our models achieve state-of-the-art performance on the CIFAR-10 and STL-10 datasets in the unsupervised setting.",4 "Nonextensive information theoretical machine. In this paper, we propose a new discriminative model named \emph{nonextensive information theoretical machine (NITM)} based on a nonextensive generalization of Shannon information theory. In NITM, weight parameters are treated as random variables. Tsallis divergence is used to regularize the distribution of the weight parameters, and the maximum unnormalized Tsallis entropy distribution is used to evaluate the fitting effect. On the one hand, we show that some well-known margin-based loss functions such as the $\ell_{0/1}$ loss, the hinge loss, the squared hinge loss and the exponential loss can be unified by unnormalized Tsallis entropy. On the other hand, Gaussian prior regularization is generalized to Student-t prior regularization with similar computational complexity. The model can be solved efficiently by gradient-based convex optimization and its performance is illustrated on standard datasets.",4 "An efficient watermarking algorithm to improve payload and robustness without affecting image perceptual quality. Capacity, robustness, and perceptual quality of watermark data are important issues to be considered. A lot of research is going on to increase these parameters in the watermarking of digital images, as there is always a tradeoff among them. In this paper an efficient watermarking algorithm to improve payload and robustness without affecting the perceptual quality of the image data, based on DWT, is discussed. The aim of the paper is to employ nested watermarks in the wavelet domain, which increases the capacity and ultimately the robustness against attacks, while the selection of different scaling factor values and HH bands for embedding creates no visible artifacts in the original image; therefore the original and watermarked images are similar.",4 "Shape from texture using locally scaled point processes.
Shape from texture refers to the extraction of 3D information from 2D images with irregular texture. This paper introduces a statistical framework to learn shape from texture where convex texture elements in a 2D image are represented as a point process. In a first step, the 2D image is preprocessed to generate a probability map corresponding to an estimate of the unnormalized intensity of the latent point process underlying the texture elements. The latent point process is subsequently inferred from the probability map in a non-parametric, model free manner. Finally, the 3D information is extracted from the point pattern by applying a locally scaled point process model, where the local scaling function represents the deformation caused by the projection of a 3D surface onto a 2D image.",19 "Relativistic Monte Carlo. Hamiltonian Monte Carlo (HMC) is a popular Markov chain Monte Carlo (MCMC) algorithm that generates proposals for a Metropolis-Hastings algorithm by simulating the dynamics of a Hamiltonian system. However, HMC is sensitive to large time discretizations and performs poorly if there is a mismatch between the spatial geometry of the target distribution and the scales of the momentum distribution. In particular the mass matrix of HMC is hard to tune well. In order to alleviate these problems we propose relativistic Hamiltonian Monte Carlo, a version of HMC based on relativistic dynamics that introduces a maximum velocity on particles. We also derive stochastic gradient versions of the algorithm and show that the resulting algorithms bear interesting relationships to gradient clipping, RMSprop, Adagrad and Adam, popular optimisation methods in deep learning. Based on this, we develop relativistic stochastic gradient descent by taking the zero-temperature limit of relativistic stochastic gradient Hamiltonian Monte Carlo. In experiments we show that the relativistic algorithms perform better than classical Newtonian variants and Adam.",19 "Relative upper confidence bound for the k-armed dueling bandit problem. This paper proposes a new method for the k-armed dueling bandit problem, a variation on the regular k-armed bandit problem that offers only relative feedback about pairs of arms. Our approach extends the upper confidence bound algorithm to the relative setting by using estimates of the pairwise probabilities to select a promising arm and applying upper confidence bound with the winner as a benchmark.
We prove a finite-time regret bound of order O(log t). In addition, our empirical results using real data from an information retrieval application show that it greatly outperforms the state of the art.",4 "Compressive optical deflectometric tomography: a constrained total-variation minimization approach. Optical deflectometric tomography (ODT) provides an accurate characterization of transparent materials whose complex surfaces present a real challenge for manufacture and control. In ODT, the refractive index map (RIM) of a transparent object is reconstructed by measuring light deflection under multiple orientations. We show that this imaging modality can be made ""compressive"", i.e., a correct RIM reconstruction is achievable with far fewer observations than required by traditional filtered back projection (FBP) methods. Assuming a cartoon-shape RIM model, the reconstruction is driven by minimizing the map total-variation under a fidelity constraint with the available observations. Moreover, two realistic assumptions are added to improve the stability of our approach: the map positivity and a frontier condition. Numerically, our method relies on an accurate ODT sensing model and on a primal-dual minimization scheme, easily including the sensing operator and the proposed RIM constraints. We conclude this paper by demonstrating the power of our method on synthetic and experimental data under various compressive scenarios. In particular, the compressiveness and stabilization of the ODT problem is demonstrated by observing a typical gain of 20 dB compared to FBP at only 5% of 360 incident light angles for moderately noisy sensing.",4 "Learning global features for coreference resolution. There is compelling evidence that coreference prediction would benefit from modeling global information about entity-clusters. Yet, state-of-the-art performance can be achieved with systems treating each mention prediction independently, which we attribute to the inherent difficulty of crafting informative cluster-level features. We instead propose to use recurrent neural networks (RNNs) to learn latent, global representations of entity clusters directly from their mentions.
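The dueling bandit abstract above maintains upper confidence bounds on pairwise win probabilities and uses them to shortlist promising arms. A minimal sketch of that bookkeeping (the duel counts, the confidence scaling `alpha`, and the candidate rule are illustrative assumptions in the spirit of RUCB, not the paper's exact algorithm):

```python
import math

def ucb_matrix(wins, t, alpha=0.5):
    """Upper confidence bounds on pairwise win probabilities.
    wins[i][j] = number of times arm i has beaten arm j so far."""
    k = len(wins)
    u = [[0.5] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            if i == j:
                continue
            n = wins[i][j] + wins[j][i]
            mean = wins[i][j] / n if n else 0.5
            bonus = math.sqrt(alpha * math.log(t) / n) if n else 1.0
            u[i][j] = min(1.0, mean + bonus)
    return u

# Hypothetical duel history among 3 arms at time t = 10:
wins = [[0, 8, 9],
        [2, 0, 5],
        [1, 5, 0]]
U = ucb_matrix(wins, t=10)
# Arms whose every pairwise UCB is at least 0.5 remain plausible winners.
candidates = [i for i in range(3) if all(U[i][j] >= 0.5 for j in range(3))]
```

Arms that are optimistically at least even against every opponent stay in play; the algorithm then duels a candidate against its strongest apparent rival, which is what drives the O(log t) regret.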
We show that such representations are especially useful for the prediction of pronominal mentions, and can be incorporated into an end-to-end coreference system that outperforms the state of the art without requiring any additional search.",4 "Learning low dimensional convolutional neural networks for high-resolution remote sensing image retrieval. Learning powerful feature representations for image retrieval has always been a challenging task in the field of remote sensing. Traditional methods focus on extracting low-level hand-crafted features, which are not only time-consuming but also tend to achieve unsatisfactory performance due to the content complexity of remote sensing images. In this paper, we investigate how to extract deep feature representations based on convolutional neural networks (CNN) for high-resolution remote sensing image retrieval (HRRSIR). To this end, two effective schemes are proposed to generate powerful feature representations for HRRSIR. In the first scheme, deep features are extracted from the fully-connected and convolutional layers of a pre-trained CNN model, respectively; in the second scheme, we propose a novel CNN architecture based on conventional convolution layers and a three-layer perceptron. The novel CNN model is then trained on a large remote sensing dataset to learn low dimensional features. The two schemes are evaluated on several public and challenging datasets, and the results indicate that the proposed schemes, and in particular the novel CNN, are able to achieve state-of-the-art performance.",4 "A metric learning perspective of SVM: on the relation of SVM and LMNN. Support vector machines, SVMs, and the large margin nearest neighbor algorithm, LMNN, are two very popular learning algorithms with quite different learning biases. In this paper we bring them into a unified view and show that they have a much stronger relation than what is commonly thought. We analyze SVMs from a metric learning perspective and cast them as a metric learning problem, a view which helps us uncover the relations of the two algorithms. We show that LMNN can be seen as learning a set of local SVM-like models in a quadratic space. Along the way, and inspired by the metric-based interpretation of SVMs, we derive a novel variant of SVMs, epsilon-SVM, to which LMNN is even more similar. We give a unified view of LMNN and the different SVM variants.
Finally we provide some preliminary experiments on a number of benchmark datasets in which we show that epsilon-SVM compares favorably both with respect to LMNN and SVM.",4 "Geometric primitive feature extraction - concepts, algorithms, and applications. This thesis presents important insights and concepts related to the topic of the extraction of geometric primitives from the edge contours of digital images. Three specific problems related to this topic have been studied, viz., polygonal approximation of digital curves, tangent estimation of digital curves, and ellipse fitting and detection from digital curves. For the problem of polygonal approximation, two fundamental problems have been addressed. First, the nature of the performance evaluation metrics in relation to the local and global fitting characteristics has been studied. Second, an explicit error bound of the error introduced by digitizing a continuous line segment has been derived and used to propose a generic non-heuristic parameter independent framework which can be used in several dominant point detection methods. For the problem of tangent estimation for digital curves, a simple method of tangent estimation has been proposed. It is shown that the method has a definite upper bound of the error for conic digital curves. It is shown that the method performs better than almost all (seventy two) existing tangent estimation methods for conic as well as several non-conic digital curves. For the problem of fitting ellipses on digital curves, a geometric distance minimization model has been considered. An unconstrained, linear, non-iterative, and numerically stable ellipse fitting method has been proposed and it is shown that the proposed method has better selectivity for elliptic digital curves (high true positive and low false positive) as compared to several other ellipse fitting methods. For the problem of detecting ellipses in a set of digital curves, several innovative and fast pre-processing, grouping, and hypotheses evaluation concepts applicable to digital curves have been proposed and combined to form an ellipse detection method.",4 "Accelerated block coordinate proximal gradients with applications in high dimensional statistics. Nonconvex optimization problems arise in different research fields and arouse lots of attention in signal processing, statistics and machine learning.
In this work, we explore the accelerated proximal gradient method and some of its variants which have been shown to converge in the nonconvex context recently. We show that a novel variant proposed here, which exploits adaptive momentum and block coordinate update with specific update rules, further improves the performance of a broad class of nonconvex problems. In applications to sparse linear regression with regularizations like Lasso, grouped Lasso, capped $\ell_1$ and SCAP, the proposed scheme enjoys provable local linear convergence, with experimental justification.",12 "Local procrustes for manifold embedding: a measure of embedding quality and embedding algorithms. We present the Procrustes measure, a novel measure based on Procrustes rotation that enables quantitative comparison of the output of manifold-based embedding algorithms (such as LLE (Roweis and Saul, 2000) and Isomap (Tenenbaum et al, 2000)). The measure also serves as a natural tool when choosing dimension-reduction parameters. We also present two novel dimension-reduction techniques which attempt to minimize the suggested measure, and compare the results of these techniques to the results of existing algorithms. Finally, we suggest a simple iterative method that can be used to improve the output of existing algorithms.",19 "A general algorithm for deciding transportability of experimental results. Generalizing empirical findings to new environments, settings, or populations is essential in most scientific explorations. This article treats a particular problem of generalizability, called ""transportability"", defined as a license to transfer information learned in experimental studies to a different population, in which only observational studies can be conducted. Given a set of assumptions concerning commonalities and differences between the two populations, Pearl and Bareinboim (2011) derived sufficient conditions that permit such transfer to take place. This article summarizes their findings and supplements them with an effective procedure for deciding when and how transportability is feasible. It establishes a necessary and sufficient condition for deciding when causal effects in the target population are estimable from both the statistical information available and the causal information transferred from the experiments.
The article further provides a complete algorithm for computing the transport formula, that is, a way of combining observational and experimental information to synthesize a bias-free estimate of the desired causal relation. Finally, the article examines the differences between transportability and other variants of generalizability.",4 "Convolutional neural networks for joint object detection and pose estimation: a comparative study. In this paper we study the application of convolutional neural networks for jointly detecting objects depicted in still images and estimating their 3D pose. We identify different feature representations of oriented objects, and energies that lead a network to learn these representations. The choice of the representation is crucial since the pose of an object has a natural, continuous structure while its category is a discrete variable. We evaluate the different approaches on the joint object detection and pose estimation task of the PASCAL3D+ benchmark using average viewpoint precision. We show that a classification approach on discretized viewpoints achieves state-of-the-art performance for joint object detection and pose estimation, and significantly outperforms existing baselines on this benchmark.",4 "Maximum production of transmission messages rate for service discovery protocols. Minimizing the number of dropped user datagram protocol (UDP) messages in a network is regarded as a challenge by researchers. This issue represents a serious problem for many protocols, particularly those that depend on sending messages as part of their strategy, such as service discovery protocols. This paper proposes and evaluates an algorithm to predict the minimum period of time required between two consecutive messages, and suggests the minimum queue sizes for the routers, to manage the traffic and minimise the number of dropped messages caused by either congestion or queue overflow or both together. The algorithm has been applied to the universal plug and play (UPnP) protocol using the NS2 simulator. It was tested with the routers connected in two configurations: centralized and decentralized. The message length and the bandwidth of the links among the routers were taken into consideration. The result shows a clear improvement in the number of dropped messages among the routers.",4 "Innateness, AlphaZero, and artificial intelligence. The concept of innateness is rarely discussed in the context of artificial intelligence.
When it is discussed, or hinted at, it is often in the context of trying to reduce the amount of innate machinery in a given system. In this paper, we consider as a test case a recent series of papers by Silver et al (Silver et al., 2017a) on AlphaGo and its successors that presented an argument that ""even in the most challenging of domains: it is possible to train to superhuman level, without human examples or guidance"", ""starting tabula rasa."" We argue that these claims are overstated, for multiple reasons. We close by arguing that artificial intelligence needs greater attention to innateness, and point to what proposals about innateness might look like.",4 "Generalization without systematicity: on the compositional skills of sequence-to-sequence recurrent networks. Humans can understand and produce new utterances effortlessly, thanks to their compositional skills. Once a person learns the meaning of a new verb ""dax,"" he or she can immediately understand the meaning of ""dax twice"" or ""sing and dax."" In this paper, we introduce the SCAN domain, consisting of a set of simple compositional navigation commands paired with the corresponding action sequences. We then test the zero-shot generalization capabilities of a variety of recurrent neural networks (RNNs) trained on SCAN with sequence-to-sequence methods. We find that RNNs can make successful zero-shot generalizations when the differences between training and test commands are small, so that they can apply ""mix-and-match"" strategies to solve the task. However, when generalization requires systematic compositional skills (as in the ""dax"" example above), RNNs fail spectacularly. We conclude with a proof-of-concept experiment in neural machine translation, suggesting that lack of systematicity might be partially responsible for neural networks' notorious training data thirst.",4 "Locality and low-dimensions in the prediction of natural experience from fMRI. Functional magnetic resonance imaging (fMRI) provides dynamical access to the complex functioning of the human brain, detailing the hemodynamic activity of thousands of voxels during hundreds of sequential time points. One approach towards illuminating the connection between fMRI and cognitive function is through decoding; how do the time series of voxel activities combine to provide information about the internal and external experience? We seek models of fMRI decoding which are balanced between the simplicity of their interpretation and the effectiveness of their prediction.
use signals subject immersed virtual reality compare global local methods prediction applying linear nonlinear techniques dimensionality reduction. find prediction complex stimuli remarkably low-dimensional, saturating less 100 features. particular, build effective models based decorrelated components cognitive activity classically-defined brodmann areas. stimuli, top predictive areas surprisingly transparent, including wernicke's area verbal instructions, visual cortex facial body features, visual-temporal regions velocity. direct sensory experience resulted robust predictions, highest correlation ($c \sim 0.8$) predicted experienced time series verbal instructions. techniques based non-linear dimensionality reduction (laplacian eigenmaps) performed similarly. interpretability relative simplicity approach provides conceptual basis upon build sophisticated techniques fmri decoding offers window cognitive function dynamic, natural experience.",16 "convex optimization big data. article reviews recent advances convex optimization algorithms big data, aim reduce computational, storage, communications bottlenecks. provide overview emerging field, describe contemporary approximation techniques like first-order methods randomization scalability, survey important role parallel distributed computation. new big data algorithms based surprisingly simple principles attain staggering accelerations even classical problems.",12 "data mining concept ""end world"" twitter microblogs. paper describes analysis quantitative characteristics frequent sets association rules posts twitter microblogs, related discussion ""end world"", allegedly predicted december 21, 2012 due mayan calendar. discovered frequent sets association rules characterize semantic relations concepts analyzed subjects. the support frequent sets reaches global maximum expected event time delay. frequent sets may considered predictive markers characterize significance expected events blogosphere users.
shown time dynamics confidence revealed association rules also predictive characteristics. exceeding certain threshold, may signal corresponding reaction society time interval maximum probable coming event.",4 "improving performance english-tamil statistical machine translation system using source-side pre-processing. machine translation one major oldest active research area natural language processing. currently, statistical machine translation (smt) dominates machine translation research. statistical machine translation approach machine translation uses models learn translation patterns directly data, generalize translate new unseen text. smt approach largely language independent, i.e. models applied language pair. statistical machine translation (smt) attempts generate translations using statistical methods based bilingual text corpora. corpora available, excellent results attained translating similar texts, corpora still available many language pairs. statistical machine translation systems, general, difficulty handling morphology source target side especially morphologically rich languages. errors morphology syntax target language severe consequences meaning sentence. change grammatical function words understanding sentence incorrect tense information verb. baseline smt also known phrase based statistical machine translation (pbsmt) system use linguistic information operates surface word form. recent researches shown adding linguistic information helps improve accuracy translation less amount bilingual corpora. adding linguistic information done using factored statistical machine translation system pre-processing steps. paper investigates english side pre-processing used improve accuracy english-tamil smt system.",4 "semi-structured data extraction modelling: wia project. last decades, amount data kinds available electronically increased dramatically. 
data accessible range interfaces including web browsers, database query languages, application-specific interfaces, built top number different data exchange formats. data span un-structured highly structured data. often, structure even structure implicit, rigid regular found standard database systems. spreadsheet documents prototypical respect. spreadsheets lightweight technology able supply companies easy build business management business intelligence applications, business people largely adopt spreadsheets smart vehicles data files generation sharing. actually, spreadsheets grow complexity (e.g., use product development plans quoting), arrangement, maintenance, analysis appear knowledge-driven activity. algorithmic approach problem automatic data structure extraction spreadsheet documents (i.e., grid-structured free topological-related data) emerges wia project: worksheets intelligent analyser. wia-algorithm shows provide description spreadsheet contents terms higher level abstractions conceptualisations. particular, wia-algorithm target extraction i) calculus work-flow implemented spreadsheets formulas ii) logical role played data take part calculus. aim resulting conceptualisations provide spreadsheets abstract representations useful model refinements optimizations evolutionary algorithms computations.",4 "characterizing maximum parameter total-variation denoising pseudo-inverse divergence. focus maximum regularization parameter anisotropic total-variation denoising. corresponds minimum value regularization parameter solution remains constant. value well known lasso, critical value investigated details total-variation. though, importance tuning regularization parameter allows fixing upper-bound grid optimal parameter sought. establish closed form expression one-dimensional case, well upper-bound two-dimensional case, appears reasonably tight practice.
problem directly linked computation pseudo-inverse divergence, quickly obtained performing convolutions fourier domain.",19 "incremental maintenance association rules support threshold change. maintenance association rules interesting problem. several incremental maintenance algorithms proposed since work (cheung et al, 1996). majority algorithms maintain rule bases assuming support threshold unchanged. paper, present incremental maintenance algorithm support threshold change. solution allows user maintain rule base support threshold.",4 "cognitive mind-map framework foster trust. explorative mind-map dynamic framework, emerges automatically input, gets. unlike verificative modeling system existing (human) thoughts placed connected together. regard, explorative mind-maps change size continuously, adaptive connectionist cells inside; mind-maps process data input incrementally offer lots possibilities interact user appropriate communication interface. respect cognitive motivated situation like conversation partners, mind-maps become interesting able process stimulating signals whenever occur. signals close understanding world, conversational partner becomes automatically trustful signals less match knowledge scheme. (position) paper, therefore motivate explorative mind-maps cognitive engine propose decision support engine foster trust.",4 "context aware nonnegative matrix factorization clustering. article propose method refine clustering results obtained nonnegative matrix factorization (nmf) technique, imposing consistency constraints final labeling data. research community focused effort initialization optimization part method, without paying attention final cluster assignments. propose game theoretic framework object clustered represented player, choose cluster membership. information obtained nmf used initialize strategy space players weighted graph used model interactions among players.
interactions allow players choose cluster coherent clusters chosen similar players, property not guaranteed nmf, since produces soft clustering data. results common benchmarks show model able improve performances many nmf formulations.",4 "adaptive admm spectral penalty parameter selection. alternating direction method multipliers (admm) versatile tool solving wide range constrained optimization problems, differentiable non-differentiable objective functions. unfortunately, performance highly sensitive penalty parameter, makes admm often unreliable hard automate non-expert user. tackle weakness admm proposing method adaptively tune penalty parameters achieve fast convergence. resulting adaptive admm (aadmm) algorithm, inspired successful barzilai-borwein spectral method gradient descent, yields fast convergence relative insensitivity initial stepsize problem scaling.",4 "a-ward_p$\beta$: effective hierarchical clustering using minkowski metric fast k-means initialisation. paper make two novel contributions hierarchical clustering. first, introduce anomalous pattern initialisation method hierarchical clustering algorithms, called a-ward, capable substantially reducing time take converge. method generates initial partition sufficiently large number clusters. allows cluster merging process start partition rather trivial partition composed solely singletons. second contribution extension ward ward p algorithms situation feature weight exponent differ exponent minkowski distance. new method, called a-ward p$\beta$, able generate much wider variety clustering solutions. also demonstrate parameters estimated reasonably well using cluster validity index. perform numerous experiments using data sets two types noise, insertion noise features blurring within-cluster values features.
experiments allow us conclude: (i) anomalous pattern initialisation method indeed reduce time hierarchical clustering algorithm takes complete, without negatively impacting cluster recovery ability; (ii) a-ward p$\beta$ provides better cluster recovery ward ward p.",4 "methods for the computerised representation of lexical data / methods of storing lexical data. recent years, new developments area lexicography altered management, processing publishing lexicographical data, also created new types products electronic dictionaries thesauri. expand range possible uses lexical data support users flexibility, instance assisting human translation. article, give short easy-to-understand introduction problematic nature storage, display interpretation lexical data. describe main methods specifications used build represent lexical data. paper targeted following groups people: linguists, lexicographers, specialists, computer linguists others wish learn modelling, representation visualization lexical knowledge. paper written two languages: french german.",4 "lower bound analysis population-based evolutionary algorithms pseudo-boolean functions. evolutionary algorithms (eas) population-based general-purpose optimization algorithms, successfully applied various real-world optimization tasks. however, previous theoretical studies often employ eas parent offspring population focus specific problems. furthermore, often show upper bounds running time, lower bounds also necessary get complete understanding algorithm. paper, analyze running time ($\mu$+$\lambda$)-ea (a general population-based ea mutation only) class pseudo-boolean functions unique global optimum. applying recently proposed switch analysis approach, prove lower bound $\Omega(n \ln n + \mu + \lambda n \ln\ln n / \ln n)$ first time.
particularly two widely-studied problems, onemax leadingones, derived lower bound discloses ($\mu$+$\lambda$)-ea strictly slower (1+1)-ea population size $\mu$ $\lambda$ moderate order. results imply increase population size, usually desired practice, bears risk increasing lower bound running time thus carefully considered.",4 "numerical weather prediction stochastic modeling: objective criterion choice global radiation forecasting. numerous methods exist developed global radiation forecasting. two popular types numerical weather predictions (nwp) predictions using stochastic approaches. propose compute parameter noted constructed part mutual information quantity measures mutual dependence two variables. calculated objective establish relevant method nwp stochastic models concerning current problem.",19 "dirichlet fragmentation processes. tree structures ubiquitous data across many domains, many datasets naturally modelled unobserved tree structures. paper, first review theory random fragmentation processes [bertoin, 2006], number existing methods modelling trees, including popular nested chinese restaurant process (ncrp). define general class probability distributions trees: dirichlet fragmentation process (dfp) novel combination theory dirichlet processes random fragmentation processes. dfp presents stick-breaking construction, relates ncrp way dirichlet process relates chinese restaurant process.
furthermore, develop novel hierarchical mixture model dfp, empirically compare new model similar models machine learning. experiments show dfp mixture model convincingly better existing state-of-the-art approaches hierarchical clustering density modelling.",19 "gaussian processes data-efficient learning robotics control. autonomous learning promising direction control robotics decade since data-driven learning allows reduce amount engineering knowledge, otherwise required. however, autonomous reinforcement learning (rl) approaches typically require many interactions system learn controllers, practical limitation real systems, robots, many interactions impractical time consuming. address problem, current learning approaches typically require task-specific knowledge form expert demonstrations, realistic simulators, pre-shaped policies, specific knowledge underlying dynamics. article, follow different approach speed learning extracting information data. particular, learn probabilistic, non-parametric gaussian process transition model system. explicitly incorporating model uncertainty long-term planning controller learning approach reduces effects model errors, key problem model-based learning. compared state-of-the-art rl model-based policy search method achieves unprecedented speed learning. demonstrate applicability autonomous learning real robot control tasks.",19 "geometric decision tree. paper present new algorithm learning oblique decision trees. current decision tree algorithms rely impurity measures assess goodness hyperplanes node learning decision tree top-down fashion. impurity measures do not properly capture geometric structures data. motivated this, algorithm uses strategy assess hyperplanes way geometric structure data taken account. node decision tree, find clustering hyperplanes classes use angle bisectors split rule node. show empirical studies idea leads small decision trees better performance.
also present analysis show angle bisectors clustering hyperplanes use split rules node, solutions interesting optimization problem hence argue principled method learning decision tree.",4 "generalization error bounds probabilistic guarantee sgd nonconvex optimization. success deep learning led rising interest generalization property stochastic gradient descent (sgd) method, stability one popular approach study it. existing works based stability studied nonconvex loss functions, considered generalization error sgd expectation. paper, establish various generalization error bounds probabilistic guarantee sgd. specifically, general nonconvex loss functions gradient dominant loss functions, characterize on-average stability iterates generated sgd terms on-average variance stochastic gradients. characterization leads improved bounds generalization error sgd. study regularized risk minimization problem strongly convex regularizers, obtain improved generalization error bounds proximal sgd. strongly convex regularizers, establish generalization error bounds nonconvex loss functions proximal sgd high-probability guarantee, i.e., exponential concentration probability.",19 "improving vision-based self-positioning intelligent transportation systems via integrated lane vehicle detection. traffic congestion widespread problem. dynamic traffic routing systems congestion pricing getting importance recent research. lane prediction vehicle density estimation important component systems. introduce novel problem vehicle self-positioning involves predicting number lanes road vehicle's position lanes using videos captured dashboard camera. propose integrated closed-loop approach use presence vehicles aid task self-positioning vice-versa. incorporate multiple factors high-level semantic knowledge solution, formulate problem bayesian framework. framework, number lanes, vehicle's position lanes presence vehicles considered parameters. 
also propose bounding box selection scheme reduce number false detections increase computational efficiency. show number box proposals decreases factor 6 using selection approach. also results large reduction number false detections. entire approach tested real-world videos found give acceptable results.",4 "relaxation graph coloring satisfiability problems. using t=0 monte carlo simulation, study relaxation graph coloring (k-col) satisfiability (k-sat), two hard problems recently shown possess phase transition solvability parameter varied. change exponentially fast power law relaxation, transition freezing behavior found. changes take place smaller values parameter solvability transition. results coloring problem colorable clustered graphs fraction persistent spins satisfiability also presented.",3 "early human visual system compete deep neural networks?. study compare human visual system state-of-the-art deep neural networks classification distorted images. different previous works, limit display time 100ms test early mechanisms human visual system, without allowing time eye movements higher level processes. findings show human visual system still outperforms modern deep neural networks blurry noisy images. findings motivate future research developing robust deep networks.",4 "using fast weights attend recent past. recently, research artificial neural networks largely restricted systems two types variable: neural activities represent current recent input weights learn capture regularities among inputs, outputs payoffs. good reason restriction. synapses dynamics many different time-scales suggests artificial neural networks might benefit variables change slower activities much faster standard weights. ""fast weights"" used store temporary memories recent past provide neurally plausible way implementing type attention past recently proved helpful sequence-to-sequence models. 
using fast weights avoid need store copies neural activity patterns.",19 "super-resolution wavelet-encoded images. multiview super-resolution image reconstruction (srir) often cast resampling problem merging non-redundant data multiple low-resolution (lr) images finer high-resolution (hr) grid, inverting effect camera point spread function (psf). one main problem multiview methods resampling nonuniform samples (provided lr images) inversion psf highly nonlinear ill-posed problems. non-linearity ill-posedness typically overcome linearization regularization, often iterative optimization process, essentially trade information (i.e. high frequency) want recover. propose novel point view multiview srir: unlike existing multiview methods reconstruct entire spectrum hr image multiple given lr images, derive explicit expressions show high-frequency spectra unknown hr image related spectra lr images. therefore, taking lr images reference represent low-frequency spectra hr image, one reconstruct super-resolution image focusing reconstruction high-frequency spectra. much like single-image methods, extrapolate spectrum one image, except rely information provided views, rather prior constraints single-image methods (which may accurate source information). made possible deriving applying explicit closed-form expressions define local high frequency information aim recover reference high resolution image related local low frequency information sequence views. results comparisons recently published state-of-the-art methods show superiority proposed solution.",4 "learning without concentration. obtain sharp bounds performance empirical risk minimization performed convex class respect squared loss, without assuming class members target bounded functions rapidly decaying tails. rather resorting concentration-based argument, method used relies `small-ball' assumption thus holds classes consisting heavy-tailed functions heavy-tailed targets. 
resulting estimates scale correctly `noise level' problem, applied classical, bounded scenario, always improve known bounds.",4 "general framework recognition online handwritten graphics. propose new framework recognition online handwritten graphics. three main features framework ability treat symbol structural level information integrated way, flexibility respect different families graphics, means control tradeoff recognition effectiveness computational cost. model graphic labeled graph generated graph grammar. non-terminal vertices represent subcomponents, terminal vertices represent symbols, edges represent relations subcomponents symbols. model recognition problem graph parsing problem: given input stroke set, search parse tree represents best interpretation input. graph parsing algorithm generates multiple interpretations (consistent grammar) extract optimal interpretation according cost function takes consideration likelihood scores symbols structures. parsing algorithm consists recursively partitioning stroke set according structures defined grammar, without imposing constraints present previous works (e.g. stroke ordering). avoiding constraints thanks powerful representativeness graphs, approach adapted recognition different graphic notations. show applications recognition mathematical expressions flowcharts. experimentation shows method obtains state-of-the-art accuracy applications.",4 rule-based query answering method knowledge base economic crimes. present description phd thesis aims propose rule-based query answering method relational data. approach use additional knowledge represented set rules describes source data concept (ontological) level. queries posed terms abstract level. present two methods. first one uses hybrid reasoning second one exploits forward chaining. two methods demonstrated prototypical implementation system coupled jess engine.
tests performed knowledge base selected economic crimes: fraudulent disbursement money laundering.,4 "autonomous quantum perceptron neural network. recently, rapid development technology, lot applications require achieve low-cost learning. however, computational power classical artificial neural networks not capable provide low-cost learning. contrast, quantum neural networks may represent good computational alternative classical neural network approaches, based computational power quantum bit (qubit) classical bit. paper present new computational approach quantum perceptron neural network achieve learning low-cost computation. proposed approach one neuron construct self-adaptive activation operators capable accomplish learning process limited number iterations and, thereby, reduce overall computational cost. proposed approach capable construct set activation operators applied widely quantum classical applications overcome linearity limitation classical perceptron. computational power proposed approach illustrated via solving variety problems promising comparable results given.",4 "sketching large-scale learning mixture models. learning parameters voluminous data prohibitive terms memory computational requirements. propose ""compressive learning"" framework estimate model parameters sketch training data. sketch collection generalized moments underlying probability distribution data. computed single pass training set, easily computable streams distributed datasets. proposed framework shares similarities compressive sensing, aims drastically reducing dimension high-dimensional signals preserving ability reconstruct them. perform estimation task, derive iterative algorithm analogous sparse reconstruction algorithms context linear inverse problems. exemplify framework compressive estimation gaussian mixture model (gmm), providing heuristics choice sketching procedure theoretical guarantees reconstruction.
experimentally show synthetic data proposed algorithm yields results comparable classical expectation-maximization (em) technique requiring significantly less memory fewer computations number database elements large. demonstrate potential approach real large-scale data (over $10^8$ training samples) task model-based speaker verification. finally, draw connections proposed framework approximate hilbert space embedding probability distributions using random features. show proposed sketching operator seen innovative method design translation-invariant kernels adapted analysis gmms. also use theoretical framework derive information preservation guarantees, spirit infinite-dimensional compressive sensing.",4 "one-pass person re-identification sketch online discriminant analysis. person re-identification (re-id) match people across disjoint camera views multi-camera system, re-id important technology applied smart city recent years. however, majority existing person re-id methods not designed processing sequential data online way. ignores real-world scenario person images detected multi-cameras system coming sequentially. work discussing online re-id, require considerable storage passed data samples ever observed, could unrealistic processing data large camera network. work, present one-pass person re-id model adapts re-id model based newly observed data passed data directly used update. specifically, develop sketch online discriminant analysis (soda) embedding sketch processing fisher discriminant analysis (fda). soda efficiently keep main data variations passed samples low rank matrix processing sequential data samples, estimate approximate within-class variance (i.e. within-class covariance matrix) sketch data information. provide theoretical analysis effect estimated approximate within-class covariance matrix. particular, derive upper lower bounds fisher discriminant score (i.e.
quotient between-class variation within-class variation feature transformation) order investigate optimal feature transformation learned soda sequentially approximates offline fda learned observed data. extensive experimental results shown effectiveness soda empirically support theoretical analysis.",4 "image enhancement statistical estimation. contrast enhancement important area research image analysis. decade, researcher worked domain develop efficient adequate algorithm. proposed method enhance contrast image using binarization method help maximum likelihood estimation (mle). paper aims enhance image contrast bimodal multi-modal images. proposed methodology use collect mathematical information retrieves image. paper, using binarization method generates desired histogram separating image nodes. generates enhanced image using histogram specification binarization method. proposed method showed improvement image contrast enhancement compare image.",4 "os* algorithm: joint approach exact optimization sampling. current sampling algorithms high-dimensional distributions based mcmc techniques approximate sense valid asymptotically. rejection sampling, hand, produces valid samples, unrealistically slow high-dimension spaces. os* algorithm propose unified approach exact optimization sampling, based incremental refinements functional upper bound, combines ideas adaptive rejection sampling a* optimization search. show choice refinement done way ensures tractability high-dimension spaces, present first experiments two different settings: inference high-order hmms large discrete graphical models.",4 "lego: learning edge geometry watching videos. learning estimate 3d geometry single image watching unlabeled videos via deep convolutional network attracting significant attention. paper, introduce ""3d as-smooth-as-possible (3d-asap)"" priori inside pipeline, enables joint estimation edges 3d scene, yielding results significant improvement accuracy fine detailed structures. 
specifically, define 3d-asap priori requiring two points recovered 3d image lie existing planar surface if no other cues provided. design unsupervised framework learns edges geometry (depth, normal) (lego). predicted edges embedded depth surface normal smoothness terms, pixels without edges in-between constrained satisfy priori. framework, predicted depths, normals edges forced consistent time. conduct experiments kitti evaluate estimated geometry cityscapes perform edge evaluation. show tasks, i.e. depth, normal edge, algorithm vastly outperforms state-of-the-art (sota) algorithms, demonstrating benefits approach.",4 "zipf's law word frequencies: word forms versus lemmas long texts. zipf's law fundamental paradigm statistics written spoken natural language well communication systems. raise question elementary units zipf's law hold natural way, studying validity plain word forms corresponding lemma forms. order homogeneous sources possible, analyze longest literary texts ever written, comprising four different languages, different levels morphological complexity. cases zipf's law fulfilled, sense power-law distribution word lemma frequencies valid several orders magnitude. investigate extent word-lemma transformation preserves two parameters zipf's law: exponent low-frequency cut-off. able demonstrate strict invariance tail, texts exponents deviate significantly, conclude exponents similar, despite remarkable transformation going words lemmas represents, considerably affecting ranges frequencies. contrast, low-frequency cut-offs less stable.",15 "context-aware generative adversarial privacy. preserving utility published datasets simultaneously providing provable privacy guarantees well-known challenge. one hand, context-free privacy solutions, differential privacy, provide strong privacy guarantees, often lead significant reduction utility.
hand, context-aware privacy solutions, information theoretic privacy, achieve improved privacy-utility tradeoff, assume data holder access dataset statistics. circumvent limitations introducing novel context-aware privacy framework called generative adversarial privacy (gap). gap leverages recent advancements generative adversarial networks (gans) allow data holder learn privatization schemes dataset itself. gap, learning privacy mechanism formulated constrained minimax game two players: privatizer sanitizes dataset way limits risk inference attacks individuals' private variables, adversary tries infer private variables sanitized dataset. evaluate gap's performance, investigate two simple (yet canonical) statistical dataset models: (a) binary data model, (b) binary gaussian mixture model. models, derive game-theoretically optimal minimax privacy mechanisms, show privacy mechanisms learned data (in generative adversarial fashion) match theoretically optimal ones. demonstrates framework easily applied practice, even absence dataset statistics.",4 "crowdsourcing ground truth medical relation extraction. cognitive computing systems require human labeled data evaluation, often training. standard practice used gathering data minimizes disagreement annotators, found results data fails account ambiguity inherent language. proposed crowdtruth method collecting ground truth crowdsourcing, reconsiders role people machine learning based observation disagreement annotators provides useful signal phenomena ambiguity text. report using method build annotated data set medical relation extraction $cause$ $treat$ relations, data performed supervised training experiment. demonstrate modeling ambiguity, labeled data gathered crowd workers (1) reach level quality domain experts task reducing cost, (2) provide better training data scale distant supervision. 
we further propose and validate new weighted measures for precision, recall, and f-measure that account for ambiguity in both human and machine performance on this task.",4 "churn prediction in mobile social games: towards a complete assessment using survival ensembles. reducing user attrition, i.e. churn, is a broad challenge faced by several industries. in mobile social games, decreasing churn is decisive to increase player retention and raise revenues. churn prediction models allow us to understand player loyalty and to anticipate when they will stop playing a game. thanks to these predictions, several initiatives can be taken to retain those players who are likely to churn. survival analysis focuses on predicting the time of occurrence of a certain event, churn in our case. classical methods, like regressions, could be applied only to players who have left the game. the challenge arises for datasets with incomplete churning information for all players, as most of them still connect to the game. this is called the censored data problem and it is inherent to the nature of churn. censoring is commonly dealt with using survival analysis techniques, but due to the inflexibility of survival statistical algorithms, the accuracy achieved is often poor. in contrast, novel ensemble learning techniques, increasingly popular in a variety of scientific fields, provide high-class prediction results. in this work, we develop, for the first time in the social games domain, a survival ensemble model which provides a comprehensive analysis together with an accurate prediction of churn. for each player, we predict the probability of churning as a function of time, which permits us to distinguish various levels of loyalty profiles. additionally, we assess the risk factors that explain the predicted player survival times. our results show that churn prediction by survival ensembles significantly improves the accuracy and robustness of traditional analyses, like cox regression.",19 "design of an intelligent agents based system for commodity market simulation in jade. the market of the potato commodity at industry scale engages several types of actors: farmers, middlemen, and industries. a multi-agent system is built to simulate those actors as agent entities, based on manually given parameters within a simulation scenario file. each type of agent has fuzzy logic representing the actual actors' knowledge, used for interpreting values and taking the appropriate decision in the simulation.
the system simulates market activities with programmed behaviors and produces the results as spreadsheet and chart graph files. the results consist of each agent's yearly finance and commodity data. the system can also predict the next values of the outputs.",4 "an efficient algorithm for learning with semi-bandit feedback. we consider the problem of online combinatorial optimization under semi-bandit feedback. the goal of the learner is to sequentially select its actions from a combinatorial decision set so as to minimize its cumulative loss. we propose a learning algorithm for this problem based on combining the follow-the-perturbed-leader (fpl) prediction method with a novel loss estimation procedure called geometric resampling (gr). contrary to previous solutions, the resulting algorithm can be efficiently implemented whenever efficient offline combinatorial optimization is possible at all. assuming that the elements of the decision set can be described with d-dimensional binary vectors with at most m non-zero entries, we show that the expected regret of our algorithm after t rounds is o(m sqrt(dt log d)). as a side result, we also improve the best known regret bounds for fpl in the full information setting to o(m^(3/2) sqrt(t log d)), gaining a factor of sqrt(d/m) over previous bounds for this algorithm.",4 "a hybrid model for solving multi-objective problems using an evolutionary algorithm and tabu search. this paper presents a new multi-objective hybrid model that combines the strength in neighborhood search of tabu search (ts) with the important exploration capacity of an evolutionary algorithm. the model was implemented and tested on benchmark functions (zdt1, zdt2, zdt3), using a network of computers.",4 "combining models of approximation with partial learning. in gold's framework of inductive inference, the model of partial learning requires the learner to output exactly one correct index for the target object, and only for the target object, infinitely often. since infinitely many of the learner's hypotheses may be incorrect, it is not obvious whether a partial learner can be modified to ""approximate"" the target object. fulk and jain (approximate inference and scientific method. information and computation 114(2):179--191, 1994) introduced a model of approximate learning of recursive functions.
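the geometric resampling idea in the semi-bandit abstract above can be illustrated with a toy simulation (a sketch under assumed settings: a small fixed sampling distribution `p`, not the paper's combinatorial setup). the number of re-draws until the chosen action reappears is geometric, so its mean is exactly the inverse probability the algorithm needs, obtained from samples alone:

```python
import random

def geometric_resampling(p, action, rng, cap=10_000):
    """count draws from distribution p until `action` reappears.

    the count k is geometric with success probability p[action],
    so e[k] = 1 / p[action] -- an importance weight estimated
    from samples, never requiring the probability itself.
    """
    arms = range(len(p))
    for k in range(1, cap + 1):
        if rng.choices(arms, weights=p)[0] == action:
            return k
    return cap  # truncation keeps the estimator bounded

rng = random.Random(0)
p = [0.5, 0.3, 0.2]
estimates = [geometric_resampling(p, 0, rng) for _ in range(20_000)]
mean_estimate = sum(estimates) / len(estimates)  # should be near 1/0.5 = 2
```

the truncation cap mirrors the boundedness the analysis relies on: without it the estimator is unbiased but can take arbitrarily large values.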
the present work extends their research and solves an open problem of fulk and jain by showing that there is a learner which approximates and partially identifies every recursive function by outputting a sequence of hypotheses which, in addition, are almost all finite variants of the target function. the subsequent study is dedicated to the question of how these findings generalise to the learning of r.e. languages from positive data. here three variants of approximate learning are introduced and investigated with respect to the question of whether they can be combined with partial learning. following the line of fulk and jain's research, the investigations provide conditions under which partial language learners eventually output only finite variants of the target language. the combinability of other partial learning criteria is also briefly studied.",4 "corpus annotation and parser evaluation. we describe a recently developed corpus annotation scheme for evaluating parsers that avoids shortcomings of current methods. the scheme encodes grammatical relations between heads and dependents, and has been used to mark up a new public-domain corpus of naturally occurring english text. we show how the corpus can be used to evaluate the accuracy of a robust parser, and relate the corpus to extant resources.",4 "linearly parameterized bandits. we consider bandit problems involving a large (possibly infinite) collection of arms, in which the expected reward of each arm is a linear function of an $r$-dimensional random vector $\mathbf{z} \in \mathbb{r}^r$, where $r \geq 2$. the objective is to minimize the cumulative regret and bayes risk. when the set of arms corresponds to the unit sphere, we prove that the regret and bayes risk are of order $\theta(r \sqrt{t})$, by establishing a lower bound for an arbitrary policy, and showing that a matching upper bound is obtained through a policy that alternates between exploration and exploitation phases. the phase-based policy is also shown to be effective if the set of arms satisfies a strong convexity condition. for the case of a general set of arms, we describe a near-optimal policy whose regret and bayes risk admit upper bounds of the form $o(r \sqrt{t} \log^{3/2} t)$.",4 "an iterative closest point method for measuring the level of similarity of 3d log scans in the wood industry. in the canadian lumber industry, simulators are used to predict the lumber resulting from the sawing of a log at a given sawmill. given one or several logs' 3d scans as input, simulators perform a real-time job to predict the lumber.
these simulators, however, tend to be slow at processing large volumes of wood. we thus explore an alternative approximation technique based on the iterative closest point (icp) algorithm to identify which of the already processed logs an unseen log resembles most. the main benefit of the icp approach is that it can easily handle 3d scans with a variable number of points. we compare the icp-based nearest neighbor predictor to predictors built using machine learning algorithms such as k-nearest-neighbor (knn) and random forest (rf). the implemented icp-based predictor enabled us to identify key points in using 3d scans directly for distance calculation. the long-term goal of this ongoing research is to integrate icp distance calculations into machine learning.",4 "deep motion features for visual tracking. robust visual tracking is a challenging computer vision problem, with many real-world applications. most existing approaches employ hand-crafted appearance features, such as hog or color names. recently, deep rgb features extracted from convolutional neural networks have been successfully applied for tracking. despite their success, these features only capture appearance information. on the other hand, motion cues provide discriminative and complementary information that can improve tracking performance. contrary to visual tracking, deep motion features have been successfully applied for action recognition and video classification tasks. typically, the motion features are learned by training a cnn on optical flow images extracted from large amounts of labeled videos. this paper presents an investigation of the impact of deep motion features in a tracking-by-detection framework. we show that hand-crafted, deep rgb, and deep motion features contain complementary information. to the best of our knowledge, we are the first to propose fusing appearance information with deep motion features for visual tracking. comprehensive experiments clearly suggest that our fusion approach with deep motion features outperforms standard methods relying on appearance information alone.",4 "causal inference on multivariate and mixed-type data. given data over the joint distribution of two random variables $x$ and $y$, we consider the problem of inferring the most likely causal direction between $x$ and $y$. in particular, we consider the general case where both $x$ and $y$ may be univariate or multivariate, and of the same or mixed data types.
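the nearest-neighbour-plus-rigid-alignment loop behind icp, as used in the log-scan abstract above, can be sketched in a few lines of numpy (a simplified toy: a small synthetic point grid and a gentle known displacement stand in for real log scans):

```python
import numpy as np

def icp(source, target, iters=10):
    """toy icp: alternate nearest-neighbour matching with a
    kabsch (svd-based) rigid alignment of source onto target."""
    src = source.copy()
    for _ in range(iters):
        # nearest neighbour in target for every source point
        d = ((src[:, None, :] - target[None, :, :]) ** 2).sum(-1)
        matched = target[d.argmin(axis=1)]
        # best rigid transform for these correspondences (kabsch)
        cs, ct = src.mean(0), matched.mean(0)
        H = (src - cs).T @ (matched - ct)
        U, _, Vt = np.linalg.svd(H)
        R = Vt.T @ U.T
        if np.linalg.det(R) < 0:   # guard against reflections
            Vt[-1] *= -1
            R = Vt.T @ U.T
        src = (src - cs) @ R.T + ct
    return src

# a small grid of points, displaced by a gentle rotation + shift
pts = np.array([[x, y, 0.0] for x in range(4) for y in range(4)])
theta = 0.05
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
target = pts @ R_true.T + np.array([0.10, 0.05, 0.0])
aligned = icp(pts, target)
residual = np.abs(aligned - target).max()
```

because the displacement here is smaller than the grid spacing, the very first nearest-neighbour matching is already correct and the alignment converges immediately; real scans need the full iteration.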
we take an information theoretic approach, based on kolmogorov complexity, from which it follows that first describing the data over the cause and then that of the effect given the cause is shorter than in the reverse direction. as the ideal score is not computable, it can be approximated through the minimum description length (mdl) principle. based on mdl, we propose two scores, one for when both $x$ and $y$ are of a single data type, and one for when they are mixed-type. we model dependencies between $x$ and $y$ using classification and regression trees. as inferring the optimal model is np-hard, we propose crack, a fast greedy algorithm to determine the most likely causal direction directly from the data. empirical evaluation on a wide range of data shows that crack reliably, and with high accuracy, infers the correct causal direction on both univariate and multivariate cause-effect pairs over both single and mixed-type data.",19 "scout-it: interior tomography using a modified scout acquisition. global scout views have previously been used to reduce interior reconstruction artifacts in high-resolution micro-ct and c-arm systems. however, these methods cannot be directly used in the all-important domain of clinical ct. when the ct scan is truncated, the scout views are also truncated. in many cases, however, the truncation in clinical ct involves only partial truncation, where the anterio-posterior (ap) scout is truncated while the medio-lateral (ml) scout is non-truncated. in this paper, we show that in such cases of partially truncated ct scans, a modified configuration may be used to acquire a non-truncated ap scout view, to ultimately allow highly accurate interior reconstruction.",16 "collaborative receptive field learning. the challenge of object categorization in images is largely due to arbitrary translations and scales of the foreground objects. to attack this difficulty, we propose a new approach called collaborative receptive field learning to extract specific receptive fields (rf's) or regions from multiple images; the selected rf's are supposed to focus on the foreground objects of a common category. to this end, we solve the problem by maximizing a submodular function over a similarity graph constructed from a pool of rf candidates. however, measuring the pairwise distance of rf's for building the similarity graph is a nontrivial problem.
hence, we introduce a similarity metric called pyramid-error distance (ped) to measure the pairwise distances by summing up pyramid-like matching errors over a set of low-level features. besides, consistent with the proposed ped, we construct a simple nonparametric classifier for classification. experimental results show that our method effectively discovers the foreground objects in images and improves classification performance.",4 "optimizing recurrent neural network architectures under time constraints. the recurrent neural network (rnn)'s architecture is a key factor influencing its performance. we propose algorithms to optimize hidden sizes under a running time constraint. we convert the discrete optimization into a subset selection problem. by novel transformations, the objective function becomes submodular and the constraint becomes supermodular. a greedy algorithm with bounds is suggested to solve the transformed problem, and we show how the transformations influence the bounds. to speed up the optimization, surrogate functions are proposed which balance exploration and exploitation. experiments show that our algorithms can find more accurate models or faster models than manually tuned state-of-the-art and random search. we also compare popular rnn architectures using our algorithms.",19 "the production of probabilistic entropy in structure/action contingency relations. luhmann (1984) defined society as a communication system which is structurally coupled to, but not an aggregate of, human action systems. the communication system is then considered as self-organizing (""autopoietic""), as are the human actors. communication systems can be studied using shannon's (1948) mathematical theory of communication. the update of a network by action at one of the local nodes is a well-known problem in artificial intelligence (pearl 1988). by combining these various theories, a general algorithm for probabilistic structure/action contingency is derived. the consequences of this contingency for the system, the consequences for its histories, and the stabilization on either side by counterbalancing mechanisms are discussed, in both mathematical and theoretical terms. an empirical example is elaborated.",4 "multilabel classification with ranking and partial feedback. we present a novel multilabel/ranking algorithm working in partial information settings.
the algorithm is based on 2nd-order descent methods, and relies on upper-confidence bounds to trade off exploration and exploitation. we analyze this algorithm in a partial adversarial setting, where covariates can be adversarial, but multilabel probabilities are ruled by (generalized) linear models. we show o(t^{1/2} log t) regret bounds, which improve in several ways on the existing results. we test the effectiveness of our upper-confidence scheme by contrasting against full-information baselines on real-world multilabel datasets, often obtaining comparable performance.",4 "approximation algorithms for $\ell_0$-low rank approximation. we study the $\ell_0$-low rank approximation problem, where the goal is, given an $m \times n$ matrix $a$, to output a rank-$k$ matrix $a'$ for which $\|a'-a\|_0$ is minimized. here, for a matrix $b$, $\|b\|_0$ denotes the number of its non-zero entries. this is an np-hard variant of low rank approximation which is natural for problems with no underlying metric, where the goal is to minimize the number of disagreeing data positions. we provide approximation algorithms which significantly improve the running time and approximation factor of previous work. for $k > 1$, we show how to find, in poly$(mn)$ time for every $k$, a rank $o(k \log(n/k))$ matrix $a'$ for which $\|a'-a\|_0 \leq o(k^2 \log(n/k)) \mathrm{opt}$. to the best of our knowledge, this is the first algorithm with provable guarantees for the $\ell_0$-low rank approximation problem for $k > 1$, even for bicriteria algorithms. for the well-studied case when $k = 1$, we give a $(2+\epsilon)$-approximation in {\it sublinear time}, which is impossible for other variants of low rank approximation such as for the frobenius norm. we strengthen this for the well-studied case of binary matrices to obtain a $(1+o(\psi))$-approximation in sublinear time, where $\psi = \mathrm{opt}/\lvert a\rvert_0$. for small $\psi$, our approximation factor is $1+o(1)$.",4 "learning spatio-temporal representation with pseudo-3d residual networks. convolutional neural networks (cnn) are regarded as a powerful class of models for image recognition problems. nevertheless, it is not trivial to utilize a cnn for learning spatio-temporal video representation. a few studies have shown that performing 3d convolutions is a rewarding approach to capture both spatial and temporal dimensions in videos. however, the development of a very deep 3d cnn from scratch results in expensive computational cost and memory demand.
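for intuition on the $\ell_0$ objective in the low-rank abstract above, here is a brute-force check on a tiny binary matrix (illustrative only; this enumeration is exponential, whereas the paper's algorithms run in polynomial or even sublinear time):

```python
from itertools import product

# a small binary matrix that is *not* a combinatorial rectangle,
# so no rank-1 binary factorization can match it exactly
A = [[1, 1, 0],
     [1, 1, 0],
     [0, 0, 1]]
m, n = len(A), len(A[0])

def l0_error(u, v):
    # ||u v^T - A||_0 = number of disagreeing entries
    return sum((u[i] * v[j]) != A[i][j]
               for i in range(m) for j in range(n))

# enumerate all rank-1 binary factors u v^T and keep the best
opt = min(l0_error(u, v)
          for u in product([0, 1], repeat=m)
          for v in product([0, 1], repeat=n))
```

here the 2x2 block of ones is recoverable by u = v = (1, 1, 0), leaving a single disagreement at the bottom-right entry, so the optimum is 1.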
a valid question is why not recycle off-the-shelf 2d networks for a 3d cnn. in this paper, we devise multiple variants of bottleneck building blocks in a residual learning framework by simulating $3\times3\times3$ convolutions with $1\times3\times3$ convolutional filters on the spatial domain (equivalent to a 2d cnn) plus $3\times1\times1$ convolutions to construct temporal connections on adjacent feature maps in time. furthermore, we propose a new architecture, named pseudo-3d residual net (p3d resnet), that exploits all the variants of blocks and composes each in a different placement of the resnet, following the philosophy that enhancing structural diversity while going deep could improve the power of neural networks. our p3d resnet achieves clear improvements on the sports-1m video classification dataset over a 3d cnn and a frame-based 2d cnn by 5.3% and 1.8%, respectively. we further examine the generalization performance of the video representation produced by our pre-trained p3d resnet on five different benchmarks and three different tasks, demonstrating superior performance over several state-of-the-art techniques.",4 "kblrn: end-to-end learning of knowledge base representations with latent, relational, and numerical features. we present kblrn, a framework for end-to-end learning of knowledge base representations from latent, relational, and numerical features. kblrn integrates the feature types with a novel combination of neural representation learning and probabilistic product of experts models. to the best of our knowledge, kblrn is the first approach that learns representations of knowledge bases by integrating latent, relational, and numerical features. we show that instances of kblrn outperform existing methods on a range of knowledge base completion tasks. we contribute novel data sets enriching commonly used knowledge base completion benchmarks with numerical features. we have made the data sets available for further research. we also investigate the impact of numerical features on the kb completion performance of kblrn.",4 "semi-supervised model-based clustering with controlled clusters leakage. in this paper, we focus on finding clusters in partially categorized data sets. we propose a semi-supervised version of the gaussian mixture model, called c3l, which retrieves natural subgroups of given categories.
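the factorization in the p3d abstract above trades one full 3x3x3 kernel for a 1x3x3 spatial plus 3x1x1 temporal pair; a quick count shows the per-layer weight savings (channel width 64 is an arbitrary example, not a figure from the paper, and intermediate-channel bookkeeping is ignored):

```python
c_in = c_out = 64  # example channel widths, chosen for illustration

full_3d  = 3 * 3 * 3 * c_in * c_out                 # one 3x3x3 convolution
p3d_pair = (1 * 3 * 3 + 3 * 1 * 1) * c_in * c_out   # 1x3x3 then 3x1x1
savings  = 1 - p3d_pair / full_3d                   # fraction of weights saved
```

per output channel the kernel shrinks from 27 to 12 weights, i.e. more than half the parameters disappear while both spatial and temporal receptive fields are retained.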
contrast semi-supervised models, c3l parametrized user-defined leakage level, controls maximal inconsistency initial categorization resulting clustering. method implemented module practical expert systems detect clusters, combine expert knowledge true distribution data. moreover, used improving results less flexible clustering techniques, projection pursuit clustering. paper presents extensive theoretical analysis model fast algorithm efficient optimization. experimental results show c3l finds high quality clustering model, applied discovering meaningful groups partially classified data.",4 "aspects evolutionary design computers. paper examines four main types evolutionary design computers: evolutionary design optimisation, evolutionary art, evolutionary artificial life forms creative evolutionary design. definitions four areas provided. review current work areas given, examples types applications tackled. different properties requirements examined. descriptions typical representations evolutionary algorithms provided examples designs evolved using techniques shown. paper discusses boundaries areas beginning merge, resulting four new 'overlapping' types evolutionary design: integral evolutionary design, artificial life based evolutionary design, aesthetic evolutionary al aesthetic evolutionary design. finally, last part paper discusses common problems faced creators evolutionary design systems, including: interdependent elements designs, epistasis, constraint handling.",4 "flow-guided feature aggregation video object detection. extending state-of-the-art object detectors image video challenging. accuracy detection suffers degenerated object appearances videos, e.g., motion blur, video defocus, rare poses, etc. existing work attempts exploit temporal information box level, methods trained end-to-end. present flow-guided feature aggregation, accurate end-to-end learning framework video object detection. leverages temporal coherence feature level instead. 
it improves the per-frame features by aggregation of nearby features along the motion paths, and thus improves the video recognition accuracy. our method significantly improves upon strong single-frame baselines on imagenet vid, especially for more challenging fast moving objects. the framework is principled, and on par with the best engineered systems winning the imagenet vid challenges 2016, without additional bells-and-whistles. the proposed method, together with deep feature flow, powered the winning entry of the imagenet vid challenges 2017. the code is available at https://github.com/msracver/flow-guided-feature-aggregation.",4 "information directed sampling for stochastic bandits with graph feedback. we consider stochastic multi-armed bandit problems with graph feedback, where the decision maker is allowed to observe the neighboring actions of the chosen action. we allow the graph structure to vary with time and consider both deterministic and erd\h{o}s-r\'enyi random graph models. for such a graph feedback model, we first present a novel analysis of thompson sampling that leads to a tighter performance bound than existing work. next, we propose new information directed sampling based policies that are graph-aware in their decision making. in the deterministic graph case, we establish a bayesian regret bound for the proposed policies that scales with the clique cover number of the graph instead of the number of actions. in the random graph case, we provide a bayesian regret bound for the proposed policies that scales with the ratio of the number of actions over the expected number of observations per iteration. to the best of our knowledge, this is the first analytical result for stochastic bandits with random graph feedback. finally, using numerical evaluations, we demonstrate that our proposed ids policies outperform existing approaches, including adaptations of the upper confidence bound, $\epsilon$-greedy and exp3 algorithms.",4 "optical flow-based 3d human motion estimation from monocular video. we present a generative method to estimate 3d human motion and body shape from monocular video. under the assumption that, starting from an initial pose, optical flow constrains subsequent human motion, we exploit flow to find temporally coherent human poses of a motion sequence. we estimate human motion by minimizing the difference between computed flow fields and the output of an artificial flow renderer.
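the graph-feedback setting in the bandit abstract above can be sketched with a toy bernoulli thompson sampler where pulling an arm also reveals reward samples for its neighbors (a simplified illustration with made-up means and a fixed path graph, not the paper's time-varying or random-graph models):

```python
import random

def thompson_graph(means, neighbors, T, seed=0):
    """toy bernoulli thompson sampling with graph feedback:
    pulling arm a also reveals a reward sample for each
    neighbor of a, so their beta posteriors update too."""
    rng = random.Random(seed)
    n = len(means)
    alpha, beta = [1] * n, [1] * n          # beta(1, 1) priors
    pulls = [0] * n
    for _ in range(T):
        samples = [rng.betavariate(alpha[i], beta[i]) for i in range(n)]
        a = max(range(n), key=samples.__getitem__)
        pulls[a] += 1
        for i in {a} | set(neighbors[a]):    # all observed arms this round
            r = 1 if rng.random() < means[i] else 0
            alpha[i] += r
            beta[i] += 1 - r
    return pulls, alpha, beta

means = [0.2, 0.5, 0.8]                      # hypothetical arm means
neighbors = {0: [1], 1: [0, 2], 2: [1]}      # a path graph over 3 arms
pulls, alpha, beta = thompson_graph(means, neighbors, T=2000)
observations = sum(alpha[i] + beta[i] - 2 for i in range(3))
```

the extra side observations are what lets regret scale with graph quantities (e.g. the clique cover number) rather than the raw number of arms: here every round yields at least two observations on a three-arm problem.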
only a single initialization step is required to estimate motion over multiple frames. several regularization functions enhance robustness over time. our test scenarios demonstrate that optical flow effectively regularizes the under-constrained problem of human shape and motion estimation from monocular video.",4 "learning belief networks in domains with recursively embedded pseudo independent submodels. a pseudo independent (pi) model is a probabilistic domain model (pdm) where proper subsets of a set of collectively dependent variables display marginal independence. pi models cannot be learned correctly by many algorithms that rely on a single link search. earlier work on learning pi models suggested a straightforward multi-link search algorithm. however, when a domain contains recursively embedded pi submodels, they may escape detection by this algorithm. in this paper, we propose an improved algorithm that ensures the learning of all embedded pi submodels whose sizes are upper bounded by a predetermined parameter. we show that this improved learning capability only increases the complexity slightly beyond that of the previous algorithm. the performance of the new algorithm is demonstrated by experiment.",4 "delay-optimal power and subcarrier allocation for ofdma systems via stochastic approximation. in this paper, we consider delay-optimal power and subcarrier allocation design for ofdma systems with $n_f$ subcarriers, $k$ mobiles and one base station. there are $k$ queues at the base station for the downlink traffic to the $k$ mobiles with heterogeneous packet arrivals and delay requirements. we model the problem as a $k$-dimensional infinite horizon average reward markov decision problem (mdp) where the control actions are assumed to be a function of the instantaneous channel state information (csi) as well as the joint queue state information (qsi). this problem is challenging because it corresponds to a stochastic network utility maximization (num) problem whose general solution is still unknown. we propose an {\em online stochastic value iteration} solution using {\em stochastic approximation}. the proposed power control algorithm, which is a function of both the csi and the qsi, takes the form of multi-level water-filling. we prove that under two mild conditions in theorem 1 (one is the stepsize condition;
the other is a condition on the accessibility of the markov chain, which can be easily satisfied in the cases of interest), the proposed solution converges to the optimal solution almost surely (with probability 1), and the proposed framework offers a possible solution to the general stochastic num problem. by exploiting the birth-death structure of the queue dynamics, we obtain a reduced complexity decomposed solution with linear $\mathcal{o}(kn_f)$ complexity and $\mathcal{o}(k)$ memory requirement.",4 "sparse overcomplete word vector representations. current distributed representations of words show little resemblance to theories of lexical semantics. the former are dense and uninterpretable, the latter largely based on familiar, discrete classes (e.g., supersenses) and relations (e.g., synonymy and hypernymy). we propose methods that transform word vectors into sparse (and optionally binary) vectors. the resulting representations are more similar to the interpretable features typically used in nlp, though they are discovered automatically from raw corpora. because the vectors are highly sparse, they are computationally easy to work with. most importantly, we find that they outperform the original vectors on benchmark tasks.",4 "the steerable graph laplacian and its application to filtering image data-sets. in recent years, improvements in various scientific image acquisition techniques gave rise to the need for adaptive processing methods aimed at large data-sets corrupted by noise and deformations. in this work, we consider data-sets of images sampled from an underlying low-dimensional manifold (i.e. an image-valued manifold), where the images can be obtained under arbitrary planar rotations. we derive a mathematical framework for processing such data-sets, and introduce a graph laplacian-like operator, termed steerable graph laplacian (sgl), which extends the standard graph laplacian (gl) by accounting for all (infinitely-many) planar rotations of all images. as it turns out, a properly normalized sgl converges to the laplace-beltrami operator on the low-dimensional manifold, with an improved convergence rate compared to the gl. moreover, the sgl admits eigenfunctions of the form of fourier modes multiplied by eigenvectors of certain matrices. for image data-sets corrupted by noise, we employ a subset of these eigenfunctions to ""filter"" the data-set, essentially using all images and their rotations simultaneously.
we demonstrate our filtering framework by de-noising simulated single-particle cryo-em image data-sets.",4 "deep learning for identifying radiogenomic associations in breast cancer. purpose: to determine whether deep learning models can distinguish between breast cancer molecular subtypes based on dynamic contrast-enhanced magnetic resonance imaging (dce-mri). materials and methods: in this institutional review board-approved single-center study, we analyzed dce-mr images of 270 patients at our institution. lesions of interest were identified by radiologists. the task was to automatically determine whether the tumor is of the luminal subtype or of another subtype based on the mr image patches representing the tumor. three different deep learning approaches were used to classify the tumor according to molecular subtype: learning from scratch, where only tumor patches were used for training; transfer learning, where networks pre-trained on natural images were fine-tuned using tumor patches; and off-the-shelf deep features, where features extracted by neural networks trained on natural images were used for classification with a support vector machine. the network architectures utilized in our experiments were googlenet, vgg, and cifar. we used 10-fold crossvalidation as the validation method and the area under the receiver operating characteristic curve (auc) to measure performance. results: the best auc performance for distinguishing molecular subtypes was 0.65 (95% ci:[0.57,0.71]), achieved by the off-the-shelf deep features approach. the highest auc performance for training from scratch was 0.58 (95% ci:[0.51,0.64]) and the best auc performance for transfer learning was 0.60 (95% ci:[0.52,0.65]), respectively. for the off-the-shelf approach, the features extracted from the fully connected layer performed best. conclusion: deep learning may play a role in discovering radiogenomic associations in breast cancer.",4 "traversing knowledge graphs in vector space. path queries on a knowledge graph can be used to answer compositional questions such as ""what languages are spoken by people living in lisbon?"". however, knowledge graphs often have missing facts (edges), which disrupts path queries. recent models for knowledge base completion impute missing facts by embedding knowledge graphs in vector spaces.
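the vector-space path-query idea in the abstract above can be shown with a tiny translation-style toy (hand-made 2d embeddings and entity/relation names invented for the example; real systems learn these from data):

```python
# toy vector-space path query: answer s/r1/r2 by adding relation
# vectors to the source entity and taking the nearest entity.
entities = {"lisbon": (0.0, 0.0), "portugal": (1.0, 0.0),
            "portuguese": (1.0, 1.0)}
relations = {"located_in": (1.0, 0.0), "official_language": (0.0, 1.0)}

def answer_path(source, path):
    x, y = entities[source]
    for r in path:                       # compose relations by addition
        dx, dy = relations[r]
        x, y = x + dx, y + dy
    # nearest entity to the composed point answers the query
    return min(entities, key=lambda e: (entities[e][0] - x) ** 2
                                       + (entities[e][1] - y) ** 2)

ans = answer_path("lisbon", ["located_in", "official_language"])
```

each relation application here is exact, which is why the path answer is recovered; with learned embeddings each hop carries error, and that compounding is the cascading-error problem the compositional training objective targets.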
we show that these models can be recursively applied to answer path queries, but that they suffer from cascading errors. this motivates a new ""compositional"" training objective, which dramatically improves the models' ability to answer path queries, in some cases doubling accuracy. on a standard knowledge base completion task, we also demonstrate that compositional training acts as a novel form of structural regularization, reliably improving performance across all base models (reducing errors by up to 43%) and achieving new state-of-the-art results.",4 "rapid learning with stochastic focus of attention. we present a method to stop the evaluation of a decision making process when the result of the full evaluation is obvious. this trait is highly desirable for online margin-based machine learning algorithms where a classifier traditionally evaluates all the features for every example. we observe that some examples are easier to classify than others, a phenomenon characterized by the event when most of the features agree on the class of an example. by stopping the feature evaluation when encountering an easy to classify example, the learning algorithm can achieve substantial gains in computation. our method provides a natural attention mechanism for learning algorithms. by modifying pegasos, a margin-based online learning algorithm, to include our attentive method, we lower the number of attributes computed from $n$ to an average of $o(\sqrt{n})$ features without loss of prediction accuracy. we demonstrate the effectiveness of attentive pegasos on mnist data.",4 "a novel parser design algorithm based on artificial ants. this article presents a unique design for a parser using the ant colony optimization algorithm. the paper implements the intuitive thought process of the human mind through the activities of artificial ants. the scheme presented uses a bottom-up approach to parsing, and the program can directly use ambiguous or redundant grammars. we allocate a node corresponding to each production rule present in the given grammar. each node is connected to all other nodes (representing the other production rules), thereby establishing a completely connected graph susceptible to the movement of artificial ants. each ant tries to modify the sentential form by the production rule present at its node and upgrades its position until the sentential form reduces to the start symbol s. successful ants deposit pheromone on the links they traversed through.
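the early-stopping idea in the attention abstract above can be sketched as a lazy dot product: evaluation halts as soon as the partial margin can no longer be overturned by the unevaluated features (a sketch under an assumed uniform feature bound R; the paper's stochastic scheme differs in detail):

```python
def attentive_score(w, x, R=1.0):
    """evaluate sign(w . x) lazily: stop once the partial sum
    cannot be overturned by the remaining features, assuming
    every |x_i| <= R.  returns (sign, features_used)."""
    n = len(w)
    # suffix[j] bounds the magnitude of the unevaluated tail
    suffix = [0.0] * (n + 1)
    for j in range(n - 1, -1, -1):
        suffix[j] = suffix[j + 1] + abs(w[j]) * R
    partial = 0.0
    for j in range(n):
        partial += w[j] * x[j]
        if abs(partial) > suffix[j + 1]:     # outcome is decided
            return (1 if partial > 0 else -1), j + 1
    return (1 if partial > 0 else -1), n

w = [5.0] + [0.01] * 99                      # one dominant feature
x = [1.0] * 100
sign, used = attentive_score(w, x)
full_sign = 1 if sum(a * b for a, b in zip(w, x)) > 0 else -1
```

on this deliberately easy example the sign is decided after a single feature, which is exactly the computational saving claimed for easy-to-classify examples.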
eventually, the optimum path is discovered from the links carrying the maximum amount of pheromone concentration. the design is simple, versatile, robust and effective, and obviates the calculation of the aforementioned sets and precedence relation tables. the advantages of the scheme lie in i) ascertaining whether a given string belongs to the language represented by the grammar, and ii) finding the shortest possible path from the given string to the start symbol in case multiple routes exist.",4 "towards stability and optimality in stochastic gradient descent. iterative procedures for parameter estimation based on stochastic gradient descent allow the estimation to scale to massive data sets. however, in both theory and practice, they suffer from numerical instability. moreover, they are statistically inefficient estimators of the true parameter value. to address these two issues, we propose a new iterative procedure termed averaged implicit sgd (ai-sgd). for statistical efficiency, ai-sgd employs averaging of the iterates, which achieves the optimal cram\'{e}r-rao bound under strong convexity, i.e., it is an optimal unbiased estimator of the true parameter value. for numerical stability, ai-sgd employs an implicit update at each iteration, which is related to proximal operators in optimization. in practice, ai-sgd achieves competitive performance with state-of-the-art procedures. furthermore, it is more stable than averaging procedures that do not employ proximal updates, and it is simple to implement as it requires fewer tunable hyperparameters than procedures that do employ proximal updates.",19 "collaborating robotics using nature-inspired meta-heuristics. this paper introduces collaborating robots, which provide the possibility of enhanced task performance, high reliability and decreased cost. collaborating-bots are a collection of mobile robots able to self-assemble and self-organize in order to solve problems that cannot be solved by a single robot. these robots combine the power of swarm intelligence with the flexibility of self-reconfiguration, as aggregates of collaborating-bots can dynamically change their structure to match environmental variations. collaborating robots are networks of independent agents, and are potentially reconfigurable networks of communicating agents capable of coordinated sensing and interaction with the environment. robots are going to be an important part of the future.
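the implicit update in the ai-sgd abstract above has a closed form for least squares, which makes the averaged-implicit recipe easy to sketch (a toy on noiseless synthetic data with an assumed constant step size; the paper's analysis covers more general models and step-size schedules):

```python
import numpy as np

def ai_sgd(X, y, lr=0.5, epochs=50):
    """averaged implicit sgd for least squares.  the implicit step
        theta' = theta + lr * (y_i - x_i . theta') * x_i
    solves in closed form as below; averaging the trajectory gives
    the statistically efficient estimate."""
    n, d = X.shape
    theta = np.zeros(d)
    iterates = []
    for _ in range(epochs):
        for i in range(n):
            x_i, y_i = X[i], y[i]
            resid = y_i - x_i @ theta
            # closed-form implicit (proximal) update
            theta = theta + (lr / (1.0 + lr * (x_i @ x_i))) * resid * x_i
            iterates.append(theta)
    # average the tail of the trajectory (burn-in discarded)
    theta_avg = np.mean(iterates[len(iterates) // 2:], axis=0)
    return theta_avg, theta

rng = np.random.default_rng(0)
theta_star = np.array([1.0, -2.0, 0.5])
X = rng.normal(size=(50, 3))
y = X @ theta_star                       # noiseless for the demo
theta_avg, theta_last = ai_sgd(X, y)
```

note the denominator 1 + lr * ||x_i||^2: it shrinks the effective step automatically when an observation is large, which is the source of the numerical stability that explicit sgd lacks.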
although collaborating robots have limited individual capability, when deployed in large numbers they can represent a strong force, similar to a colony of ants or a swarm of bees. we present a mechanism for collaborating robots based on swarm intelligence techniques such as ant colony optimization and particle swarm optimization.",4 "people on media: jointly identifying credible news and trustworthy citizen journalists in online communities. media seems to have become more partisan, often providing a biased coverage of news catering to the interest of specific groups. it is therefore essential to identify credible information content that provides an objective narrative of an event. news communities such as digg, reddit, or newstrust offer recommendations, reviews, quality ratings, and further insights on journalistic works. however, there is a complex interaction between different factors in such online communities: fairness and style of reporting, language clarity and objectivity, topical perspectives (like political viewpoint), expertise and bias of community members, and more. this paper presents a model to systematically analyze the different interactions in a news community between users, news, and sources. we develop a probabilistic graphical model that leverages this joint interaction to identify 1) highly credible news articles, 2) trustworthy news sources, and 3) expert users who perform the role of ""citizen journalists"" in the community. our method extends crf models to incorporate real-valued ratings, as some communities have very fine-grained scales that cannot be easily discretized without losing information. to the best of our knowledge, this paper presents the first full-fledged analysis of credibility, trust, and expertise in news communities.",4 "a maximal large deviation inequality for sub-gaussian variables. in this short note we prove a maximal concentration lemma for sub-gaussian random variables, stating that for independent sub-gaussian random variables we have \[p\left(\max_{1\le i\le n}s_{i}>\epsilon\right) \le\exp\left(-\frac{1}{n^2}\sum_{i=1}^{n}\frac{\epsilon^{2}}{2\sigma_{i}^{2}}\right), \] where $s_i$ is the sum of the first $i$ zero mean independent sub-gaussian random variables and $\sigma_i$ is the variance of the $i$th random variable.",4 "local expectation gradients for doubly stochastic variational inference.
we introduce local expectation gradients, a general purpose stochastic variational inference algorithm for constructing stochastic gradients by sampling from the variational distribution. this algorithm divides the problem of estimating the stochastic gradients over multiple variational parameters into smaller sub-tasks, so that each sub-task intelligently exploits the information coming from the most relevant part of the variational distribution. this is achieved by performing an exact expectation over the single random variable that most correlates with the variational parameter of interest, resulting in a rao-blackwellized estimate that has low variance and can work efficiently for both continuous and discrete random variables. furthermore, the proposed algorithm has interesting similarities with gibbs sampling, but, at the same time and unlike gibbs sampling, it can be trivially parallelized.",19 "cognitive principles in robust multimodal interpretation. multimodal conversational interfaces provide a natural means for users to communicate with computer systems through multiple modalities such as speech and gesture. to build effective multimodal interfaces, automated interpretation of user multimodal inputs is important. inspired by previous investigations on cognitive status in multimodal human machine interaction, we have developed a greedy algorithm for interpreting user referring expressions (i.e., multimodal reference resolution). this algorithm incorporates the cognitive principles of conversational implicature and the givenness hierarchy, and applies constraints from various sources (e.g., temporal, semantic, and contextual) to resolve references. our empirical results have shown the advantage of this algorithm in efficiently resolving a variety of user references. because of its simplicity and generality, this approach has the potential to improve the robustness of multimodal input interpretation.",4 "conditional random fields and support vector machines: a hybrid approach. we propose a novel hybrid loss for multiclass and structured prediction problems that is a convex combination of the log loss for conditional random fields (crfs) and the multiclass hinge loss for support vector machines (svms). we provide a sufficient condition for when the hybrid loss is fisher consistent for classification. this condition depends on a measure of dominance between labels - specifically, the gap between the per observation probabilities of the most likely labels.
also prove fisher consistency necessary parametric consistency learning models crfs. demonstrate empirically hybrid loss typically performs least well - often better - constituent losses variety tasks. also provide empirical comparison efficacy probabilistic margin based approaches multiclass structured prediction effects label dominance results.",4 "greedy active learning algorithm logistic regression models. study logistic model-based active learning procedure binary classification problems, adopt batch subject selection strategy modified sequential experimental design method. moreover, accompanying proposed subject selection scheme, simultaneously conduct greedy variable selection procedure update classification model labeled training subjects. proposed algorithm repeatedly performs subject variable selection steps prefixed stopping criterion reached. numerical results show proposed procedure competitive performance, smaller training size compact model, comparing classifier trained variables full data set. also apply proposed procedure well-known wave data set (breiman et al., 1984) confirm performance method.",19 "interactively transferring cnn patterns part localization. scenario one/multi-shot learning, conventional end-to-end learning strategies without sufficient supervision usually powerful enough learn correct patterns noisy signals. thus, given cnn pre-trained object classification, paper proposes method first summarizes knowledge hidden inside cnn dictionary latent activation patterns, builds new model part localization manually assembling latent patterns related target part via human interactions. use (e.g., three) annotations semantic object part retrieve certain latent patterns conv-layers represent target part. visualize latent patterns ask users remove incorrect patterns, order refine part representation. 
guidance human interactions, method exhibited superior performance part localization experiments.",4 "comparative study cnn, bovw lbp classification histopathological images. despite progress made field medical imaging, remains large area open research, especially due variety imaging modalities disease-specific characteristics. paper comparative study describing potential using local binary patterns (lbp), deep features bag-of-visual words (bovw) scheme classification histopathological images. introduce new dataset, \emph{kimia path960}, contains 960 histopathology images belonging 20 different classes (different tissue types). make dataset publicly available. small size dataset inter- intra-class variability makes ideal initial investigations comparing image descriptors search classification complex medical imaging cases like histopathology. investigate deep features, lbp histograms bovw classify images via leave-one-out validation. accuracy image classification obtained using lbp 90.62\% highest accuracy using deep features reached 94.72\%. dictionary approach (bovw) achieved 96.50\%. deep solutions may able deliver higher accuracies need extensive training large number (balanced) image datasets.",4 "algorithms irrelevance-based partial maps. irrelevance-based partial maps useful constructs domain-independent explanation using belief networks. look two definitions partial maps, prove important properties useful designing algorithms computing effectively. make use properties modifying standard map best-first algorithm, handle irrelevance-based partial maps.",4 "principal manifolds nonlinear dimension reduction via local tangent space alignment. nonlinear manifold learning unorganized data points challenging unsupervised learning data visualization problem great variety applications. paper present new algorithm manifold learning nonlinear dimension reduction. 
based set unorganized data points sampled noise manifold, represent local geometry manifold using tangent spaces learned fitting affine subspace neighborhood data point. tangent spaces aligned give internal global coordinates data points respect underlying manifold way partial eigendecomposition neighborhood connection matrix. present careful error analysis algorithm show reconstruction errors second-order accuracy. illustrate algorithm using curves surfaces 2d/3d higher dimensional euclidean spaces, 64-by-64 pixel face images various pose lighting conditions. also address several theoretical algorithmic issues research improvements.",4 "detection algorithms communication systems using deep learning. design analysis communication systems typically rely development mathematical models describe underlying communication channel, dictates relationship transmitted received signals. however, systems, molecular communication systems chemical signals used transfer information, possible accurately model relationship. scenarios, lack mathematical channel models, completely new approach design analysis required. work, focus one important aspect communication systems, detection algorithms, demonstrate borrowing tools deep learning, possible train detectors perform well, without knowledge underlying channel models. evaluate algorithms using experimental data collected chemical communication platform, channel model unknown difficult model analytically. show deep learning algorithms perform significantly better simple detector used previous works, also assume knowledge channel.",4 low-rank representation manifold curves. machine learning common interpret data point vector euclidean space. however data may actually functional i.e.\ data point function variable time function discretely sampled. naive treatment functional data traditional multivariate data lead poor performance since algorithms ignoring correlation curvature function. 
paper propose method analyse subspace structure functional data using state art low-rank representation (lrr). experimental evaluation synthetic real data reveals method massively outperforms conventional lrr tasks concerning functional data.,4 "kernel sparse models automated tumor segmentation. paper, propose sparse coding-based approaches segmentation tumor regions mr images. sparse coding data-adapted dictionaries successfully employed several image recovery vision problems. proposed approaches obtain sparse codes pixel brain magnetic resonance images considering intensity values location information. since trivial obtain pixel-wise sparse codes, combining multiple features sparse coding setup straightforward, propose perform sparse coding high-dimensional feature space non-linear similarities effectively modeled. use training data expert-segmented images obtain kernel dictionaries kernel k-lines clustering procedure. test image, sparse codes computed kernel dictionaries, used identify tumor regions. approach completely automated, require user intervention initialize tumor regions test image. furthermore, low complexity segmentation approach based kernel sparse codes, allows user initialize tumor region, also presented. results obtained proposed approaches validated manual segmentation expert radiologist, proposed methods lead accurate tumor identification.",4 "linking image text 2-way nets. linking two data sources basic building block numerous computer vision problems. canonical correlation analysis (cca) achieves utilizing linear optimizer order maximize correlation two views. recent work makes use non-linear models, including deep learning techniques, optimize cca loss feature space. paper, introduce novel, bi-directional neural network architecture task matching vectors two data sources. approach employs two tied neural network channels project two views common, maximally correlated space using euclidean loss. 
show direct link correlation-based loss euclidean loss, enabling use euclidean loss correlation maximization. overcome common euclidean regression optimization problems, modify well-known techniques problem, including batch normalization dropout. show state art results number computer vision matching tasks including mnist image matching sentence-image matching flickr8k, flickr30k coco datasets.",4 "transforming wikipedia ontology-based information retrieval search engine local experts using third-party taxonomy. wikipedia widely used finding general information wide variety topics. vocation provide local information. example, provides plot, cast, production information given movie, showing times local movie theatre. describe connect local information wikipedia, without altering content. case study present involves finding local scientific experts. using third-party taxonomy, independent wikipedia's category hierarchy, index information connected local experts, present activity reports, re-index wikipedia content using taxonomy. connections wikipedia pages local expert reports stored relational database, accessible public sparql endpoint. wikipedia gadget (or plugin) activated interested user, accesses endpoint wikipedia page accessed. additional tab wikipedia page allows user open list teams local experts associated subject matter wikipedia page. technique, though presented way identify local experts, generic, third party taxonomy, used connect wikipedia non-wikipedia data source.",4 "reinforcement learning based active learning method. paper, new reinforcement learning approach proposed based powerful concept named active learning method (alm) modeling. alm expresses multi-input-single-output system fuzzy combination single-input-singleoutput systems. proposed method actor-critic system similar generalized approximate reasoning based intelligent control (garic) structure adapt alm delayed reinforcement signals. 
system uses temporal difference (td) learning model behavior useful actions control system. goodness action modeled reward-penalty plane. ids planes updated according plane. shown system learn predefined fuzzy system without (through random actions).",4 "shape-based defect classification non destructive testing. aim work classify aerospace structure defects detected eddy current non-destructive testing. proposed method based assumption defect bound reaction probe coil impedance test. impedance plane analysis used extract feature vector shape coil impedance complex plane, use geometric parameters. shape recognition tested three different machine-learning based classifiers: decision trees, neural networks naive bayes. performance proposed detection system measured terms accuracy, sensitivity, specificity, precision matthews correlation coefficient. several experiments performed dataset eddy current signal samples aircraft structures. obtained results demonstrate usefulness approach competitiveness existing descriptors.",4 "view-invariant template matching using homography constraints. change viewpoint one major factors variation object appearance across different images. thus, view-invariant object recognition challenging important image understanding task. paper, propose method match objects images taken different viewpoints. unlike methods literature, restriction camera orientations internal camera parameters imposed prior knowledge 3d structure object required. prove two cameras take pictures object two different viewing angles, relationship every quadruple points reduces special case homography two equal eigenvalues. based property, formulate problem error function indicates likely two sets 2d points projections set 3d points two different cameras. comprehensive set experiments conducted prove robustness method noise, evaluate performance real-world applications, face object recognition.",4 "harnessing cognitive features sarcasm detection.
paper, propose novel mechanism enriching feature vector, task sarcasm detection, cognitive features extracted eye-movement patterns human readers. sarcasm detection challenging research problem, importance nlp applications review summarization, dialog systems sentiment analysis well recognized. sarcasm often traced incongruity becomes apparent full sentence unfolds. presence incongruity- implicit explicit- affects way readers eyes move text. observe difference behaviour eye, reading sarcastic non sarcastic sentences. motivated observation, augment traditional linguistic stylistic features sarcasm detection cognitive features obtained readers eye movement data. perform statistical classification using enhanced feature set obtained. augmented cognitive features improve sarcasm detection 3.7% (in terms f-score), performance best reported system.",4 "automatic quality assessment speech translation using joint asr mt features. paper addresses automatic quality assessment spoken language translation (slt). relatively new task defined formalized sequence labeling problem word slt hypothesis tagged good bad according large feature set. propose several word confidence estimators (wce) based automatic evaluation transcription (asr) quality, translation (mt) quality, (combined asr+mt). research work possible built specific corpus contains 6.7k utterances quintuplet containing: asr output, verbatim transcript, text translation, speech translation post-edition translation built. conclusion multiple experiments using joint asr mt features wce mt features remain influent asr feature bring interesting complementary information. robust quality estimators slt used re-scoring speech translation graphs providing feedback user interactive speech translation computer-assisted speech-to-text scenarios.",4 "object categorization finer levels requires higher spatial frequencies, therefore takes longer. 
human visual system contains hierarchical sequence modules take part visual perception different levels abstraction, i.e., superordinate, basic, subordinate levels. one important question identify ""entry"" level visual representation commenced process object recognition. long time, believed basic level advantage two others; claim challenged recently. used series psychophysics experiments, based rapid presentation paradigm, well two computational models, bandpass filtered images study processing order categorization levels. experiments, investigated type visual information required categorizing objects level varying spatial frequency bands input image. results psychophysics experiments computational models consistent. indicate different spatial frequency information different effects object categorization level. absence high frequency information, subordinate basic level categorization performed inaccurately, superordinate level performed well. means that, low frequency information sufficient superordinate level, basic subordinate levels. finer levels require high frequency information, appears take longer processed, leading longer reaction times. finally, avoid ceiling effect, evaluated robustness results adding different amounts noise input images repeating experiments. expected, categorization accuracy decreased reaction time increased significantly, trends same. this shows results due ceiling effect.",16 "denoising arterial spin labeling cerebral blood flow images using deep learning. arterial spin labeling perfusion mri noninvasive technique measuring quantitative cerebral blood flow (cbf), measurement subject low signal-to-noise-ratio (snr). various post-processing methods proposed denoise asl mri provide moderate improvement. deep learning (dl) emerging technique learn representative signal data without prior modeling highly complex analytically indescribable. purpose study assess whether record breaking performance dl translated asl mri denoising.
used convolutional neural network (cnn) build dl asl denoising model (dl-asl) inherently consider inter-voxel correlations. better guide dl-asl training, incorporated prior knowledge asl mri: structural similarity asl cbf map grey matter probability map. relatively large sample data used train model subsequently applied new set data testing. experimental results showed dl-asl achieved state-of-the-art denoising performance asl mri compared current routine methods terms higher snr, keeping cbf quantification quality shorten acquisition time 75%, automatic partial volume correction.",4 "principled hybrids generative discriminative domain adaptation. propose probabilistic framework domain adaptation blends generative discriminative modeling principled way. framework, generative discriminative models correspond specific choices prior parameters. provides us general way interpolate generative discriminative extremes different choices priors. maximizing marginal conditional log-likelihoods, models derived framework use labeled instances source domain well unlabeled instances source target domains. framework, show popular reconstruction loss autoencoder corresponds upper bound negative marginal log-likelihoods unlabeled instances, marginal distributions given proper kernel density estimations. provides way interpret empirical success autoencoders domain adaptation semi-supervised learning. instantiate framework using neural networks, build concrete model, dauto. empirically, demonstrate effectiveness dauto text, image speech datasets, showing outperforms related competitors domain adaptation possible.",4 "continuous dr-submodular maximization: structure algorithms. dr-submodular continuous functions important objectives wide real-world applications spanning map inference determinantal point processes (dpps), mean-field inference probabilistic submodular models, amongst others.
dr-submodularity captures subclass non-convex functions enables exact minimization approximate maximization polynomial time. work study problem maximizing non-monotone dr-submodular continuous functions general down-closed convex constraints. start investigating geometric properties underlie objectives, e.g., strong relation (approximately) stationary points global optimum proved. properties used devise two optimization algorithms provable guarantees. concretely, first devise ""two-phase"" algorithm $1/4$ approximation guarantee. algorithm allows use existing methods finding (approximately) stationary points subroutine, thus, harnessing recent progress non-convex optimization. present non-monotone frank-wolfe variant $1/e$ approximation guarantee sublinear convergence rate. finally, extend approach broader class generalized dr-submodular continuous functions, captures wider spectrum applications. theoretical findings validated synthetic real-world problem instances.",4 comparative study arithmetic constraints integer intervals. propose number approaches implement constraint propagation arithmetic constraints integer intervals. end introduce integer interval arithmetic. approach explained using appropriate proof rules reduce variable domains. compare approaches using set benchmarks.,4 "polarity detection movie reviews hindi language. nowadays peoples actively involved giving comments reviews social networking websites websites like shopping websites, news websites etc. large number people everyday share opinion web, results large number user data collected. users also find trivial task read reviews reached decision. would better reviews classified category user finds easier read. opinion mining sentiment analysis natural language processing task mines information various text forms reviews, news, blogs classify basis polarity positive, negative neutral. but, last years, user content hindi language also increasing rapid rate web.
important perform opinion mining hindi language well. paper hindi language opinion mining system proposed. system classifies reviews positive, negative neutral hindi language. negation also handled proposed system. experimental results using reviews movies show effectiveness system",4 """maximizing rigidity"" revisited: convex programming approach generic 3d shape reconstruction multiple perspective views. rigid structure-from-motion (rsfm) non-rigid structure-from-motion (nrsfm) long treated literature separate (different) problems. inspired previous work solved directly 3d scene structure factoring relative camera poses out, revisit principle ""maximizing rigidity"" structure-from-motion literature, develop unified theory applicable rigid non-rigid structure reconstruction rigidity-agnostic way. formulate problems convex semi-definite program, imposing constraints seek apply principle minimizing non-rigidity. results demonstrate efficacy approach, state-of-the-art accuracy various 3d reconstruction problems.",4 "pushing point view: behavioral measures manipulation wikipedia. major source information virtually topic, wikipedia serves important role public dissemination consumption knowledge. result, presents tremendous potential people promulgate points view; efforts may subtle typical vandalism. paper, introduce new behavioral metrics quantify level controversy associated particular user: controversy score (c-score) based amount attention user focuses controversial pages, clustered controversy score (cc-score) also takes account topical clustering. show measures useful identifying people try ""push"" points view, showing good predictors editors get blocked. metrics used triage potential pov pushers. apply idea dataset users requested promotion administrator status easily identify editors significantly changed behavior upon becoming administrators. time, behavior rampant. promoted administrator status tend stable behavior comparable groups prolific editors. 
suggests adminship process works well, wikipedia community overwhelmed users become administrators promote points view.",4 "hybrid medical image classification using association rule mining decision tree algorithm. main focus image mining proposed method concerned classification brain tumor ct scan brain images. major steps involved system are: pre-processing, feature extraction, association rule mining hybrid classifier. pre-processing step done using median filtering process edge features extracted using canny edge detection technique. two image mining approaches hybrid manner proposed paper. frequent patterns ct scan images generated frequent pattern tree (fp-tree) algorithm mines association rules. decision tree method used classify medical images diagnosis. system enhances classification process accurate. hybrid method improves efficiency proposed method traditional image mining methods. experimental result prediagnosed database brain images showed 97% sensitivity 95% accuracy respectively. physicians make use accurate decision tree classification phase classifying brain images normal, benign malignant effective medical diagnosis.",4 "part-to-whole registration histology mri using shape elements. image registration histology magnetic resonance imaging (mri) challenging task due differences structural content contrast. thick wide specimens cannot processed must cut smaller pieces. dramatically increases complexity problem, since piece individually manually pre-aligned. best knowledge, automatic method reliably locate piece tissue within respective whole mri slice, align without prior information. propose novel automatic approach joint problem multimodal registration histology mri, fraction tissue available histology. approach relies representation images using level lines reach contrast invariance. shape elements obtained via extraction bitangents encoded projective-invariant manner, permits identification common pieces curves two images. 
evaluated approach human brain histology compared resulting alignments manually annotated ground truths. considering complexity brain folding patterns, preliminary results promising suggest use characteristic meaningful shape elements improved robustness efficiency.",4 "use tensorflow. google's machine learning framework tensorflow open-sourced november 2015 [1] since built growing community around it. tensorflow supposed flexible research purposes also allowing models deployed productively. work aimed towards people experience machine learning considering whether use tensorflow environment. several aspects framework important decision examined, heterogeneity, extensibility computation graph. pure python implementation linear classification compared implementation utilizing tensorflow. also contrast tensorflow popular frameworks respect modeling capability, deployment performance give brief description current adaption framework.",4 "possibilistic assumption based truth maintenance system, validation data fusion application. data fusion allows elaboration evaluation situation synthesized low level informations provided different kinds sensors. fusion collected data result fewer higher level informations easily assessed human operator assist effectively decision process. paper present suitability advantages using possibilistic assumption based truth maintenance system (π-atms) data fusion military application. first describe problem, needed knowledge representation formalisms problem solving paradigms. remind reader basic concepts atmss, possibilistic logic π-atmss. finally detail solution given data fusion problem conclude results comparison non-possibilistic solution.",4 "prepaid postpaid? question. novel methods subscription type prediction mobile phone services. paper investigate behavioural differences mobile phone customers prepaid postpaid subscriptions.
study reveals (a) postpaid customers active terms service usage (b) strong structural correlations mobile phone call network connections customers subscription type much frequent customers different subscription types. based observations provide methods detect subscription type customers using information personal call statistics, also egocentric networks simultaneously. key first approach cast classification problem problem graph labelling, solved max-flow min-cut algorithms. experiments show that, using user attributes relationships, proposed graph labelling approach able achieve classification accuracy $\sim 87\%$, outperforms $\sim 7\%$ supervised learning methods using user attributes. second problem aim infer subscription type customers external operators. propose via approximate methods solve problem using node attributes, two-ways indirect inference method based observed homophilic structural correlations. results straightforward applications behavioural prediction personal marketing.",4 "reinforced video captioning entailment rewards. sequence-to-sequence models shown promising improvements temporal task video captioning, optimize word-level cross-entropy loss training. first, using policy gradient mixed-loss methods reinforcement learning, directly optimize sentence-level task-based metrics (as rewards), achieving significant improvements baseline, based automatic metrics human evaluation multiple datasets. next, propose novel entailment-enhanced reward (cident) corrects phrase-matching based metrics (such cider) allow logically-implied partial matches avoid contradictions, achieving significant improvements cider-reward model. overall, cident-reward model achieves new state-of-the-art msr-vtt dataset.",4 "deep reinforcement learning boosted external knowledge. recent improvements deep reinforcement learning allowed solve problems many 2d domains atari games. 
however, complex 3d environments, numerous learning episodes required may time consuming even impossible especially real-world scenarios. present new architecture combine external knowledge deep reinforcement learning using visual input. key concept system augmenting image input adding environment feature information combining two sources decision. evaluate performances method 3d partially-observable environment microsoft malmo platform. experimental evaluation exhibits higher performance faster learning compared single reinforcement learning model.",4 note kullback-leibler divergence von mises-fisher distribution. present derivation kullback leibler (kl)-divergence (also known relative entropy) von mises fisher (vmf) distribution $d$-dimensions.,19 "weighting scheme pairwise multi-label classifier based fuzzy confusion matrix. work addressed issue applying stochastic classifier local, fuzzy confusion matrix framework multi-label classification. proposed novel solution problem correcting label pairwise ensembles. main step correction procedure compute classifier-specific competence cross-competence measures, estimates error pattern underlying classifier. fusion phase employed two weighting approaches based information theory. classifier weights promote base classifiers susceptible correction based fuzzy confusion matrix. experimental study, proposed approach compared two reference methods. comparison made terms six different quality criteria. conducted experiments reveals proposed approach eliminates one main drawbacks original fcm-based approach i.e. original approach vulnerable imbalanced class/label distribution. more, obtained results shows introduced method achieves satisfying classification quality considered quality criteria. additionally, impact fluctuations data set characteristics reduced.",4 "scalable multi-class gaussian process classification using expectation propagation. 
paper describes expectation propagation (ep) method multi-class classification gaussian processes scales well large datasets. method estimate log-marginal-likelihood involves sum across data instances. enables efficient training using stochastic gradients mini-batches. type training used, computational cost depend number data instances $n$. furthermore, extra assumptions approximate inference process make memory cost independent $n$. consequence proposed ep method used datasets millions instances. compare empirically method alternative approaches approximate required computations using variational inference. results show performs similar even better techniques, sometimes give significantly worse predictive distributions terms test log-likelihood. besides this, training process proposed approach also seems converge smaller number iterations.",19 "predicting demographics high-resolution geographies geotagged tweets. paper, consider problem predicting demographics geographic units given geotagged tweets composed within units. traditional survey methods offer demographics estimates usually limited terms geographic resolution, geographic boundaries, time intervals. thus, would highly useful develop computational methods complement traditional survey methods offering demographics estimates finer geographic resolutions, flexible geographic boundaries (i.e. confined administrative boundaries), different time intervals. prior work focused predicting demographics health statistics relatively coarse geographic resolutions county-level state-level, introduce approach predict demographics finer geographic resolutions blockgroup-level. task predicting gender race/ethnicity counts blockgroup-level, approach adapted prior work problem achieves average correlation 0.389 (gender) 0.569 (race) held-out test dataset. approach outperforms prior approach average correlation 0.671 (gender) 0.692 (race).",4 "sparse signal subspace decomposition based adaptive over-complete dictionary. 
paper proposes subspace decomposition method based over-complete dictionary sparse representation, called ""sparse signal subspace decomposition"" (or 3sd) method. method makes use novel criterion based occurrence frequency atoms dictionary data set. criterion, well adapted subspace-decomposition dependent basis set, adequately reflects intrinsic characteristic regularity signal. 3sd method combines variance, sparsity component frequency criteria unified framework. takes benefits using over-complete dictionary preserves details subspace decomposition rejects strong noise. 3sd method simple linear retrieval operation. require prior knowledge distributions parameters. applied image denoising, demonstrates high performances preserving fine details suppressing strong noise.",19 "solving multistage influence diagrams using branch-and-bound search. branch-and-bound approach solving influence diagrams previously proposed literature, appears never implemented evaluated - apparently due difficulties computing effective bounds branch-and-bound search. paper, describe efficiently compute effective bounds, develop practical implementation depth-first branch-and-bound search influence diagram evaluation outperforms existing methods solving influence diagrams multiple stages.",4 "training adaptive dialogue policy interactive learning visually grounded word meanings. present multi-modal dialogue system interactive learning perceptually grounded word meanings human tutor. system integrates incremental, semantic parsing/generation framework - dynamic syntax type theory records (ds-ttr) - set visual classifiers learned throughout interaction ground meaning representations produces. use system interaction simulated human tutor study effects different dialogue policies capabilities accuracy learned meanings, learning rates, efforts/costs tutor.
show overall performance learning agent affected (1) takes initiative dialogues; (2) ability express/use confidence level visual attributes; (3) ability process elliptical incrementally constructed dialogue turns. ultimately, train adaptive dialogue policy optimises trade-off classifier accuracy tutoring costs.",4 "neural multi-task learning automated assessment. grammatical error detection automated essay scoring two tasks area automated assessment. traditionally tasks treated independently different machine learning models features used task. paper, develop multi-task neural network model jointly optimises tasks, particular show neural automated essay scoring significantly improved. show essay score provides little evidence inform grammatical error detection, essay score highly influenced error detection.",4 "structured pruning deep convolutional neural networks. real time application deep learning algorithms often hindered high computational complexity frequent memory accesses. network pruning promising technique solve problem. however, pruning usually results irregular network connections demand extra representation efforts also fit well parallel computation. introduce structured sparsity various scales convolutional neural networks, channel wise, kernel wise intra kernel strided sparsity. structured sparsity advantageous direct computational resource savings embedded computers, parallel computing environments hardware based systems. decide importance network connections paths, proposed method uses particle filtering approach. importance weight particle assigned computing misclassification rate corresponding connectivity pattern. pruned network re-trained compensate losses due pruning. implementing convolutions matrix products, particularly show intra kernel strided sparsity simple constraint significantly reduce size kernel feature map matrices. pruned network finally fixed point optimized reduced word length precision. 
results significant reduction total storage size providing advantages on-chip memory based implementations deep neural networks.",4 "fuzzy-rough feature selection π-membership function mammogram classification. breast cancer second leading cause death among women diagnosed help mammograms. oncologists miserably failed identifying micro calcification early stage help mammogram visually. order improve performance breast cancer screening, researchers proposed computer aided diagnosis using image processing. study mammograms preprocessed features extracted, abnormality identified classification. extracted features used, cases misidentified. hence feature selection procedure sought. paper, fuzzy-rough feature selection π membership function proposed. selected features used classify abnormalities help ant-miner weka tools. experimental analysis shows proposed method improves mammograms classification accuracy.",4 "peduncle detection sweet pepper autonomous crop harvesting - combined colour 3d information. paper presents 3d visual detection method challenging task detecting peduncles sweet peppers (capsicum annuum) field. cutting peduncle cleanly one difficult stages harvesting process, peduncle part crop attaches main stem plant. accurate peduncle detection 3d space therefore vital step reliable autonomous harvesting sweet peppers, lead precise cutting avoiding damage surrounding plant. paper makes use colour geometry information acquired rgb-d sensor utilises supervised-learning approach peduncle detection task. performance proposed method demonstrated evaluated using qualitative quantitative results (the area-under-the-curve (auc) detection precision-recall curve). able achieve auc 0.71 peduncle detection field-grown sweet peppers. release set manually annotated 3d sweet pepper peduncle images assist research community performing research topic.",4 "large-scale video classification guided batch normalized lstm translator. 
youtube-8m dataset enhances development large-scale video recognition technology imagenet dataset encouraged image classification, recognition detection artificial intelligence fields. large video dataset, challenging task classify huge amount multi-labels. change perspective, propose novel method regarding labels words. details, describe online learning approaches multi-label video classification guided deep recurrent neural networks video sentence translator. designed translator based lstms found stochastic gating input lstm cell help us design structural details. addition, adopted batch normalizations models improve lstm models. since models feature extractors, used classifiers. finally report improved validation results models large-scale youtube-8m datasets discussions improvement.",4 "(not) train generative model: scheduled sampling, likelihood, adversary?. modern applications progress deep learning research created renewed interest generative models text images. however, even today unclear objective functions one use train evaluate models. paper present two contributions. firstly, present critique scheduled sampling, state-of-the-art training method contributed winning entry mscoco image captioning benchmark 2015. show despite impressive empirical performance, objective function underlying scheduled sampling improper leads inconsistent learning algorithm. secondly, revisit problems scheduled sampling meant address, present alternative interpretation. argue maximum likelihood inappropriate training objective end-goal generate natural-looking samples. go derive ideal objective function use situation instead. introduce generalisation adversarial training, show method interpolate maximum likelihood training ideal training objective. knowledge first theoretical analysis explains adversarial training tends produce samples higher perceived quality.",19 "extreme clicking efficient object annotation. 
manually annotating object bounding boxes central building computer vision datasets, time consuming (annotating ilsvrc [53] took 35s one high-quality box [62]). involves clicking imaginary corners tight box around object. difficult corners often outside actual object several adjustments required obtain tight box. propose extreme clicking instead: ask annotator click four physical points object: top, bottom, left- right-most points. task natural points easy find. crowd-source extreme point annotations pascal voc 2007 2012 show (1) annotation time 7s per box, 5x faster traditional way drawing boxes [62]; (2) quality boxes good original ground-truth drawn traditional way; (3) detectors trained annotations accurate trained original ground-truth. moreover, extreme clicking strategy yields box coordinates, also four accurate boundary points. show (4) incorporate grabcut obtain accurate segmentations delivered initializing bounding boxes; (5) semantic segmentations models trained segmentations outperform trained segmentations derived bounding boxes.",4 "algorithms computing greatest simulations bisimulations fuzzy automata. recently, two types simulations (forward backward simulations) four types bisimulations (forward, backward, forward-backward, backward-forward bisimulations) fuzzy automata introduced. least one simulation/bisimulation types given fuzzy automata, proved greatest simulation/bisimulation kind. present paper, above-mentioned types simulations/bisimulations provide effective algorithm deciding whether simulation/bisimulation type given fuzzy automata, computing greatest one, whenever exists. algorithms based method developed [j. ignjatović, m. ćirić, s. 
bogdanović, greatest solutions certain systems fuzzy relation inequalities equations, fuzzy sets systems 161 (2010) 3081-3113], comes computing greatest post-fixed point, contained given fuzzy relation, isotone function lattice fuzzy relations.",4 "nonlinear metric learning knn svms geometric transformations. recent years, research efforts extend linear metric learning models handle nonlinear structures attracted great interests. paper, propose novel nonlinear solution utilization deformable geometric models learn spatially varying metrics, apply strategy boost performance knn svm classifiers. thin-plate splines (tps) chosen geometric model due remarkable versatility representation power accounting high-order deformations. transforming input space tps, pull same-class neighbors closer pushing different-class points farther away knn, well make input data points linearly separable svms. improvements performance knn classification demonstrated experiments synthetic real world datasets, comparisons made several state-of-the-art metric learning solutions. svm-based models also achieve significant improvements traditional linear kernel svms datasets.",4 "learning paraphrase: unsupervised approach using multiple-sequence alignment. address text-to-text generation problem sentence-level paraphrasing -- phenomenon distinct difficult word- phrase-level paraphrasing. approach applies multiple-sequence alignment sentences gathered unannotated comparable corpora: learns set paraphrasing patterns represented word lattice pairs automatically determines apply patterns rewrite new sentences. results evaluation experiments show system derives accurate paraphrases, outperforming baseline systems.",4 "commonly uncommon: semantic sparsity situation recognition. semantic sparsity common challenge structured visual classification problems; output space complex, vast majority possible predictions rarely, ever, seen training set. 
paper studies semantic sparsity situation recognition, task producing structured summaries happening images, including activities, objects roles objects play within activity. problem, find empirically object-role combinations rare, current state-of-the-art models significantly underperform sparse data regime. avoid many errors (1) introducing novel tensor composition function learns share examples across role-noun combinations (2) semantically augmenting training data automatically gathered examples rarely observed outputs using web data. integrated within complete crf-based structured prediction model, tensor-based approach outperforms existing state art relative improvement 2.11% 4.40% top-5 verb noun-role accuracy, respectively. adding 5 million images semantic augmentation techniques gives relative improvements 6.23% 9.57% top-5 verb noun-role accuracy.",4 "deepqa: improving estimation single protein model quality deep belief networks. protein quality assessment (qa) ranking selecting protein models long viewed one major challenges protein tertiary structure prediction. especially, estimating quality single protein model, important selecting good models large model pool consisting mostly low-quality models, still largely unsolved problem. introduce novel single-model quality assessment method deepqa based deep belief network utilizes number selected features describing quality model different perspectives, energy, physio-chemical characteristics, structural information. deep belief network trained several large datasets consisting models critical assessment protein structure prediction (casp) experiments, several publicly available datasets, models generated in-house ab initio method. experiment demonstrate deep belief network better performance compared support vector machines neural networks protein model quality assessment problem, method deepqa achieves state-of-the-art performance casp11 dataset. 
also outperformed two well-established methods selecting good outlier models large set models mostly low quality generated ab initio modeling methods. deepqa useful tool protein single model quality assessment protein structure prediction. source code, executable, document training/test datasets deepqa linux freely available non-commercial users http://cactus.rnet.missouri.edu/deepqa/.",4 "extending object-oriented languages declarative specifications complex objects using answer-set programming. many applications require complexly structured data objects. developing new adapting existing algorithmic solutions creating objects non-trivial costly task considered objects subject different application-specific constraints. often, however, comparatively easy declaratively describe required objects. paper, propose use answer-set programming (asp)---a well-established declarative programming paradigm area artificial intelligence---for instantiating objects standard object-oriented programming languages. particular, extend java declarative specifications required objects automatically generated using available asp solver technology.",4 "vicious circle principle formation sets asp based languages. paper continues investigation poincare russell's vicious circle principle (vcp) context design logic programming languages sets. expand previously introduced language alog aggregates allowing infinite sets several additional set related constructs useful knowledge representation teaching. addition, propose alternative formalization original vcp incorporate semantics new language, slog+, allows liberal construction sets use programming rules. show that, programs without disjunction infinite sets, formal semantics aggregates slog+ coincides several known languages. intuitive formal semantics, however, based quite different ideas seem involved slog+.",4 "stochastic proximal gradient descent nuclear norm regularization. 
paper, utilize stochastic optimization reduce space complexity convex composite optimization nuclear norm regularizer, variable matrix size $m \times n$. constructing low-rank estimate gradient, propose iterative algorithm based stochastic proximal gradient descent (spgd), take last iterate spgd final solution. main advantage proposed algorithm space complexity $o(m+n)$, contrast, previous algorithms $o(mn)$ space complexity. theoretical analysis shows achieves $o(\log t/\sqrt{t})$ $o(\log t/t)$ convergence rates general convex functions strongly convex functions, respectively.",4 "last-step regression algorithm non-stationary online learning. goal learner standard online learning maintain average loss close loss best-performing single function class. many real-world problems, rating ranking items, single best target function runtime algorithm, instead best (local) target function drifting time. develop novel last-step minmax optimal algorithm context drift. analyze algorithm worst-case regret framework show maintains average loss close best slowly changing sequence linear functions, long total drift sublinear. situations, bound improves existing bounds, additionally algorithm suffers logarithmic regret drift. also build h_infinity filter bound, develop analyze second algorithm drifting setting. synthetic simulations demonstrate advantages algorithms worst-case constant drift setting.",4 "combat models rts games. game tree search algorithms, monte carlo tree search (mcts), require access forward model (or ""simulator"") game hand. however, games forward model readily available. paper presents three forward models two-player attrition games, call ""combat models"", show used simulate combat rts games. also show combat models learned replay data. use starcraft application domain. report experiments comparing combat models predicting combat output impact used tactical decisions real game.",4 "supervised learning similarity functions. 
address problem general supervised learning data accessed (indefinite) similarity function data points. existing work learning indefinite kernels concentrated solely binary/multi-class classification problems. propose model generic enough handle supervised learning task also subsumes model previously proposed classification. give ""goodness"" criterion similarity functions w.r.t. given supervised learning task adapt well-known landmarking technique provide efficient algorithms supervised learning using ""good"" similarity functions. demonstrate effectiveness model three important supervised learning problems: a) real-valued regression, b) ordinal regression c) ranking show method guarantees bounded generalization error. furthermore, case real-valued regression, give natural goodness definition that, used conjunction recent result sparse vector recovery, guarantees sparse predictor bounded generalization error. finally, report results learning algorithms regression ordinal regression tasks using non-psd similarity functions demonstrate effectiveness algorithms, especially sparse landmark selection algorithm achieves significantly higher accuracies baseline methods offering reduced computational costs.",4 "modeling uncertain temporal evolutions model-based diagnosis. although notion diagnostic problem extensively investigated context static systems, practical applications behavior modeled system significantly variable time. goal paper propose novel approach modeling uncertainty temporal evolutions time-varying systems characterization model-based temporal diagnosis. since real world cases knowledge temporal evolution system diagnosed uncertain, consider case probabilistic temporal knowledge available component system choose model means markov chains. fact, aim exploiting statistical assumptions underlying reliability theory context diagnosis time-varying systems. 
finally show exploit markov chain theory order discard, diagnostic process, unlikely diagnoses.",4 "markov decision processes continuous side information. consider reinforcement learning (rl) setting agent interacts sequence episodic mdps. start episode agent access side-information context determines dynamics mdp episode. setting motivated applications healthcare baseline measurements patient start treatment episode form context may provide information patient might respond treatment decisions. propose algorithms learning contextual markov decision processes (cmdps) assumption unobserved mdp parameters vary smoothly observed context. also give lower upper pac bounds smoothness assumption. lower bound exponential dependence dimension, consider tractable linear setting context used create linear combinations finite set mdps. linear setting, give pac learning algorithm based kwik learning techniques.",19 "new optimal stepsize approximate dynamic programming. approximate dynamic programming (adp) proven wide range applications spanning large-scale transportation problems, health care, revenue management, energy systems. design effective adp algorithms many dimensions, one crucial factor stepsize rule used update value function approximation. many operations research applications computationally intensive, important obtain good results quickly. furthermore, popular stepsize formulas use tunable parameters produce poor results tuned improperly. derive new stepsize rule optimizes prediction error order improve short-term performance adp algorithm. one, relatively insensitive tunable parameter, new rule adapts level noise problem produces faster convergence numerical experiments.",12 "factorization discrete probability distributions. formulate necessary sufficient conditions arbitrary discrete probability distribution factor according undirected graphical model, log-linear model, general exponential models. 
result generalizes well known hammersley-clifford theorem.",4 "detection unauthorized iot devices using machine learning techniques. security experts demonstrated numerous risks imposed internet things (iot) devices organizations. due widespread adoption devices, diversity, standardization obstacles, inherent mobility, organizations require intelligent mechanism capable automatically detecting suspicious iot devices connected networks. particular, devices included white list trustworthy iot device types (allowed used within organizational premises) detected. research, random forest, supervised machine learning algorithm, applied features extracted network traffic data aim accurately identifying iot device types white list. train evaluate multi-class classifiers, collected manually labeled network traffic data 17 distinct iot devices, representing nine types iot devices. based classification 20 consecutive sessions use majority rule, iot device types white list correctly detected unknown 96% test cases (on average), white listed device types correctly classified actual types 99% cases. iot device types identified quicker others (e.g., sockets thermostats successfully detected within five tcp sessions connecting network). perfect detection unauthorized iot device types achieved upon analyzing 110 consecutive sessions; perfect classification white listed types required 346 consecutive sessions, 110 resulted 99.49% accuracy. experiments demonstrated successful applicability classifiers trained one location tested another. addition, discussion provided regarding resilience machine learning-based iot white listing method adversarial attacks.",4 "data mining prediction human performance capability software-industry. recruitment new personnel one essential business processes affect quality human capital within company. highly essential companies ensure recruitment right talent maintain competitive edge others market. 
however companies often face problem recruiting new people ongoing projects due lack proper framework defines criteria selection process. paper aim develop framework would allow project manager take right decision selecting new talent correlating performance parameters domain-specific attributes candidates. also, another important motivation behind project check validity selection procedure often followed various big companies public private sectors focus academic scores, gpa/grades students colleges academic backgrounds. test decision produce optimal results industry need change offers holistic approach recruitment new talent software companies. scope work extends beyond domain similar procedure adopted develop recruitment framework fields well. data-mining techniques provide useful information historical projects depending hiring-manager make decisions recruiting high-quality workforce. study aims bridge hiatus developing data-mining framework based ensemble-learning technique refocus criteria personnel selection. results research clearly demonstrated need refocus selection-criteria quality objectives.",4 "combining multiple time series models robust weighted mechanism. improvement time series forecasting accuracy combining multiple models important well dynamic area research. result, various forecasts combination methods developed literature. however, based simple linear ensemble strategies hence ignore possible relationships two participating models. paper, propose robust weighted nonlinear ensemble technique considers individual forecasts different models well correlations among combining. proposed ensemble constructed using three well-known forecasting models tested three real-world time series. comparison made among proposed scheme three widely used linear combination methods, terms obtained forecast errors. 
comparison shows ensemble scheme provides significantly lower forecast errors individual model well four linear combination methods.",4 ranking sentences extractive summarization reinforcement learning. single document summarization task producing shorter version document preserving principal information content. paper conceptualize extractive summarization sentence ranking task propose novel training algorithm globally optimizes rouge evaluation metric reinforcement learning objective. use algorithm train neural summarization model cnn dailymail datasets demonstrate experimentally outperforms state-of-the-art extractive abstractive systems evaluated automatically humans.,4 "vqs: linking segmentations questions answers supervised attention vqa question-focused semantic segmentation. rich dense human labeled datasets among main enabling factors recent advance vision-language understanding. many seemingly distant annotations (e.g., semantic segmentation visual question answering (vqa)) inherently connected reveal different levels perspectives human understandings visual scenes --- even set images (e.g., coco). popularity coco correlates annotations tasks. explicitly linking may significantly benefit individual tasks unified vision language modeling. present preliminary work linking instance segmentations provided coco questions answers (qas) vqa dataset, name collected links visual questions segmentation answers (vqs). transfer human supervision previously separate tasks, offer effective leverage existing problems, also open door new research problems models. study two applications vqs data paper: supervised attention vqa novel question-focused semantic segmentation task. former, obtain state-of-the-art results vqa real multiple-choice task simply augmenting multilayer perceptrons attention features learned using segmentation-qa links explicit supervision. 
put latter perspective, study two plausible methods compare oracle method assuming instance segmentations given test stage.",4 "deep transfer learning: new deep learning glitch classification method advanced ligo. exquisite sensitivity advanced ligo detectors enabled detection multiple gravitational wave signals. sophisticated design detectors mitigates effect types noise. however, advanced ligo data streams contaminated numerous artifacts known glitches: non-gaussian noise transients complex morphologies. given high rate occurrence, glitches lead false coincident detections, obscure even mimic gravitational wave signals. therefore, successfully characterizing removing glitches advanced ligo data utmost importance. here, present first application deep transfer learning glitch classification, showing knowledge deep learning algorithms trained real-world object recognition transferred classifying glitches time-series based spectrogram images. using gravity spy dataset, containing hand-labeled, multi-duration spectrograms obtained real ligo data, demonstrate method enables optimal use deep convolutional neural networks classification given small training datasets, significantly reduces time training networks, achieves state-of-the-art accuracy 98.8%, perfect precision-recall 8 22 classes. furthermore, new types glitches classified accurately given labeled examples technique. trained via transfer learning, show convolutional neural networks truncated used excellent feature extractors unsupervised clustering methods identify new classes based morphology, without labeled examples. therefore, provides new framework dynamic glitch classification gravitational wave detectors, expected encounter new types noise undergo gradual improvements attain design sensitivity.",7 "hand keypoint detection single images using multiview bootstrapping. present approach uses multi-camera system train fine-grained detectors keypoints prone occlusion, joints hand. 
call procedure multiview bootstrapping: first, initial keypoint detector used produce noisy labels multiple views hand. noisy detections triangulated 3d using multiview geometry marked outliers. finally, reprojected triangulations used new labeled training data improve detector. repeat process, generating labeled data iteration. derive result analytically relating minimum number views achieve target true false positive rates given detector. method used train hand keypoint detector single images. resulting keypoint detector runs realtime rgb images accuracy comparable methods use depth sensors. single view detector, triangulated multiple views, enables 3d markerless hand motion capture complex object interactions.",4 "minimalist grammars minimalist categorial grammars, definitions toward inclusion generated languages. stabler proposes implementation chomskyan minimalist program, chomsky 95 minimalist grammars - mg, stabler 97. framework inherits long linguistic tradition. semantic calculus easily added one uses curry-howard isomorphism. minimalist categorial grammars - mcg, based extension lambek calculus, mixed logic, introduced provide theoretically-motivated syntax-semantics interface, amblard 07. article, give full definitions mg algebraic tree descriptions mcg, take first steps towards giving proof inclusion generated languages.",4 "communication-efficient algorithm distributed sparse learning via two-way truncation. propose communicationally computationally efficient algorithm high-dimensional distributed sparse learning. iteration, local machines compute gradient local data master machine solves one shifted $l_1$ regularized minimization problem. communication cost reduced constant times dimension number state-of-the-art algorithm constant times sparsity number via two-way truncation procedure. theoretically, prove estimation error proposed algorithm decreases exponentially matches centralized method mild assumptions. 
extensive experiments simulated data real data verify proposed algorithm efficient performance comparable centralized method solving high-dimensional sparse learning problems.",19 "sentrna: improving computational rna design incorporating prior human design strategies. designing rna sequences fold specific structures perform desired biological functions emerging field bioengineering broad applications intracellular chemical catalysis cancer therapy via selective gene silencing. effective rna design requires first solving inverse folding problem: given target structure, propose sequence folds structure. although significant progress made developing computational algorithms purpose, current approaches ineffective designing sequences complex targets, limiting utility real-world applications. however, alternative shown significantly higher performance human players online rna design game eterna. many rounds gameplay, players developed collective library ""human"" rules strategies rna design proven effective current computational approaches, especially complex targets. here, present rna design agent, sentrna, consists fully-connected neural network trained using $eternasolves$ dataset, set $1.8 \times 10^4$ player-submitted sequences across 724 unique targets. agent first predicts initial sequence target using trained network, refines solution necessary using short adaptive walk utilizing canon standard design moves. approach, observe sentrna learn apply human-like design strategies solve several complex targets previously unsolvable computational approach. thus demonstrate incorporating prior human design strategies computational agent significantly boost performance, suggests new paradigm machine-based rna design.",16 "towards closing energy gap hog cnn features embedded vision. computer vision enables wide range applications robotics/drones, self-driving cars, smart internet things, portable/wearable electronics. 
many applications, local embedded processing preferred due privacy and/or latency concerns. accordingly, energy-efficient embedded vision hardware delivering real-time robust performance crucial. deep learning gaining popularity several computer vision algorithms, significant energy consumption difference exists compared traditional hand-crafted approaches. paper, provide in-depth analysis computation, energy accuracy trade-offs learned features deep convolutional neural networks (cnn) hand-crafted features histogram oriented gradients (hog). analysis supported measurements two chips implement algorithms. goal understand source energy discrepancy two approaches provide insight potential areas cnns improved eventually approach energy-efficiency hog maintaining outstanding performance accuracy.",4 "do's don'ts cnn-based face verification. research community appears developed consensus methods acquiring annotated data, design training cnns, many questions still remain answered. paper, explore following questions critical face recognition research: (i) train still images expect systems work videos? (ii) deeper datasets better wider datasets? (iii) adding label noise lead improvement performance deep networks? (iv) alignment needed face recognition? address questions training cnns using casia-webface, umdfaces, new video dataset testing youtube-faces, ijb-a disjoint portion umdfaces datasets. new data set, made publicly available, 22,075 videos 3,735,476 human annotated frames extracted them.",4 "click here: human-localized keypoints guidance viewpoint estimation. motivate address human-in-the-loop variant monocular viewpoint estimation task location class one semantic object keypoint available test time. order leverage keypoint information, devise convolutional neural network called click-here cnn (ch-cnn) integrates keypoint information activations layers process image. transforms keypoint information 2d map used weigh features certain parts image heavily. 
weighted sum spatial features combined global image features provide relevant information prediction layers. train network, collect novel dataset 3d keypoint annotations thousands cad models, synthetically render millions images 2d keypoint information. test instances pascal 3d+, model achieves mean class accuracy 90.7%, whereas state-of-the-art baseline obtains 85.7% mean class accuracy, justifying argument human-in-the-loop inference.",4 "regularized richardson-lucy algorithm sparse reconstruction poissonian images. restoration digital images degraded measurements always problem great theoretical practical importance numerous applications imaging sciences. specific solution problem image restoration generally determined nature degradation phenomenon well statistical properties measurement noises. present study concerned case images interest corrupted convolutional blurs poisson noises. deal problems, exists range solution methods based principles originating fixed-point algorithm richardson lucy (rl). paper, provide conceptual experimental proof methods tend converge sparse solutions, makes applicable images represented relatively small number non-zero samples spatial domain. unfortunately, set images relatively small, restricts applicability rl-type methods. hand, virtually practical images admit sparse representations domain properly designed linear transform. take advantage fact, therefore tempting modify rl algorithm make recover representation coefficients, rather values associated image. modification introduced paper. apart generality assumptions, proposed method also superior many established reconstruction approaches terms estimation accuracy computational complexity. conclusions study validated series numerical experiments.",4 "gender identity lexical variation social media. present study relationship gender, linguistic style, social networks, using novel corpus 14,000 twitter users. 
prior quantitative work gender often treats social variable female/male binary; argue nuanced approach. clustering twitter users, find natural decomposition dataset various styles topical interests. many clusters strong gender orientations, use linguistic resources sometimes directly conflicts population-level language statistics. view clusters accurate reflection multifaceted nature gendered language styles. previous corpus-based work also little say individuals whose linguistic styles defy population-level gender patterns. identify individuals, train statistical classifier, measure classifier confidence individual dataset. examining individuals whose language match classifier's model gender, find social networks include significantly fewer same-gender social connections that, general, social network homophily correlated use same-gender language markers. pairing computational methods social theory thus offers new perspective gender emerges individuals position relative audiences, topics, mainstream gender norms.",4 "deep generative deconvolutional image model. deep generative model developed representation analysis images, based hierarchical convolutional dictionary-learning framework. stochastic {\em unpooling} employed link consecutive layers model, yielding top-down image generation. bayesian support vector machine linked top-layer features, yielding max-margin discrimination. deep deconvolutional inference employed testing, infer latent features, top-layer features connected max-margin classifier discrimination tasks. model efficiently trained using monte carlo expectation-maximization (mcem) algorithm, implementation graphical processor units (gpus) efficient large-scale learning, fast testing. excellent results obtained several benchmark datasets, including imagenet, demonstrating proposed model achieves results highly competitive similarly sized convolutional neural networks.",4 "smoothing stochastic gradient method composite optimization. 
consider unconstrained optimization problem whose objective function composed smooth non-smooth components smooth component expectation random function. type problem arises interesting applications machine learning. propose stochastic gradient descent algorithm class optimization problem. non-smooth component particular structure, propose another stochastic gradient descent algorithm incorporating smoothing method first algorithm. proofs convergence rates two algorithms given show numerical performance algorithm applying regularized linear regression problems different sets synthetic data.",12 "skin lesion segmentation: u-nets versus clustering. many automatic skin lesion diagnosis systems use segmentation preprocessing step diagnose skin conditions skin lesion shape, border irregularity, size influence likelihood malignancy. paper presents, examines compares two different approaches skin lesion segmentation. first approach uses u-nets introduces histogram equalization based preprocessing step. second approach c-means clustering based approach much simpler implement faster execute. jaccard index algorithm output hand segmented images dermatologists used evaluate proposed algorithms. many recently proposed deep neural networks segment skin lesions require significant amount computational power training (i.e., computer gpus), main objective paper present methods used cpu. severely limits, example, number training instances presented u-net. comparing two proposed algorithms, u-nets achieved significantly higher jaccard index compared clustering approach. moreover, using histogram equalization preprocessing step significantly improved u-net segmentation results.",4 "pixel deconvolutional networks. deconvolutional layers widely used variety deep models up-sampling, including encoder-decoder networks semantic segmentation deep generative models unsupervised learning. one key limitations deconvolutional operations result so-called checkerboard problem.
caused fact direct relationship exists among adjacent pixels output feature map. address problem, propose pixel deconvolutional layer (pixeldcl) establish direct relationships among adjacent pixels up-sampled feature map. method based fresh interpretation regular deconvolution operation. resulting pixeldcl used replace deconvolutional layer plug-and-play manner without compromising fully trainable capabilities original models. proposed pixeldcl may result slight decrease efficiency, overcome implementation trick. experimental results semantic segmentation demonstrate pixeldcl consider spatial features edges shapes yields accurate segmentation outputs deconvolutional layers. used image generation tasks, pixeldcl largely overcome checkerboard problem suffered regular deconvolution operations.",4 "neural network assembly memory model based optimal binary signal detection theory. ternary/binary data coding algorithm conditions hopfield networks implement optimal convolutional hamming decoding algorithms described. using coding/decoding approach (an optimal binary signal detection theory, bsdt) introduced neural network assembly memory model (nnamm) built. model provides optimal (the best) basic memory performance demands use new memory unit architecture two-layer hopfield network, n-channel time gate, auxiliary reference memory, two nested feedback loops. nnamm explicitly describes dependence time memory trace retrieval, gives possibility metamemory simulation, generalized knowledge representation, distinct description conscious unconscious mental processes. model smallest inseparable part ""atom"" consciousness also defined. nnamm's neurobiological backgrounds applications solving interdisciplinary problems shortly discussed. bsdt could implement ""best neural code"" used nervous tissues animals humans.",4 "integrating topic models latent factors recommendation. 
research personalized recommendation techniques today mostly parted two mainstream directions, i.e., factorization-based approaches topic models. practically, aim benefit numerical ratings textual reviews, correspondingly, compose two major information sources various real-world systems. however, although two approaches supposed correlated goal accurate recommendation, still lacks clear theoretical understanding objective functions mathematically bridged leverage numerical ratings textual reviews collectively, bridge intuitively reasonable match learning procedures rating prediction top-n recommendation tasks, respectively. work, exposit mathematical analysis that, vector-level randomization functions coordinate optimization objectives factorizational topic models unfortunately exist all, although usually pre-assumed intuitively designed literature. fortunately, also point one avoid seeking randomization function optimizing joint factorizational topic (jft) model directly. apply jft model restaurant recommendation, study performance normal cross-city recommendation scenarios, latter extremely difficult task inherent cold-start nature. experimental results real-world datasets verified appealing performance approach previous methods, rating prediction top-n recommendation tasks.",4 "proficiency comparison ladtree reptree classifiers credit risk forecast. predicting credit defaulter perilous task financial industries like banks. ascertaining non-payer giving loan significant conflict-ridden task banker. classification techniques better choice predictive analysis like finding claimant, whether he/she unpretentious customer cheat. defining outstanding classifier risky assignment industrialist like banker. allow computer science researchers drill efficient research works evaluating different classifiers finding best classifier predictive problems. 
research work investigates productivity ladtree classifier reptree classifier credit risk prediction compares fitness various measures. german credit dataset taken used predict credit risk help open source machine learning tool.",4 "quantified multimodal logics simple type theory. present straightforward embedding quantified multimodal logic simple type theory prove soundness completeness. modal operators replaced quantification type possible worlds. present simple experiments, using existing higher-order theorem provers, demonstrate embedding allows automated proofs statements logics, well meta properties them.",4 flip-flop sublinear models graphs: proof theorem 1. prove class-dual almost sublinear models graphs.,4 "stability video detection tracking. paper, study important yet less explored aspect video detection tracking -- stability. surprisingly, prior work tried study it. result, start work proposing novel evaluation metric video detection considers stability accuracy. accuracy, extend existing accuracy metric mean average precision (map). stability, decompose three terms: fragment error, center position error, scale ratio error. error represents one aspect stability. furthermore, demonstrate stability metric low correlation accuracy metric. thus, indeed captures different perspective quality. lastly, based metric, evaluate several existing methods video detection show affect accuracy stability. believe work provide guidance solid baselines future researches related areas.",4 "systematic analysis state-of-the-art 3d lung nodule proposals generation. lung nodule proposals generation primary step lung nodule detection received much attention recent years. paper, first construct model 3-dimension convolutional neural network (3d cnn) generate lung nodule proposals, achieve state-of-the-art performance. then, analyze series key problems concerning training performance efficiency.
firstly, train 3d cnn model data different resolutions find models trained high resolution input data achieve better lung nodule proposals generation performances especially nodules small sizes, consumes much memory time. then, analyze memory consumptions different platforms experimental results indicate cpu architecture provide us larger memory enables us explore possibilities 3d applications. implement 3d cnn model cpu platform propose intel extended-caffe framework supports many highly-efficient 3d computations, opened source https://github.com/extendedcaffe/extended-caffe.",4 "improving statistical machine translation resource-poor language using related resource-rich languages. propose novel language-independent approach improving machine translation resource-poor languages exploiting similarity resource-rich ones. precisely, improve translation resource-poor source language x_1 resource-rich language given bi-text containing limited number parallel sentences x_1-y larger bi-text x_2-y resource-rich language x_2 closely related x_1. achieved taking advantage opportunities vocabulary overlap similarities languages x_1 x_2 spelling, word order, syntax offer: (1) improve word alignments resource-poor language, (2) augment additional translation options, (3) take care potential spelling differences appropriate transliteration. evaluation indonesian -> english using malay spanish -> english using portuguese pretending spanish resource-poor shows absolute gain 1.35 3.37 bleu points, respectively, improvement best rivaling approaches, using much less additional data. overall, method cuts amount necessary ""real"" training data factor 2--5.",4 "thoracic disease identification localization limited supervision. accurate identification localization abnormalities radiology images play integral part clinical diagnosis treatment planning. building highly accurate prediction model tasks usually requires large number images manually annotated labels finding sites abnormalities.
reality, however, annotated data expensive acquire, especially ones location annotations. need methods work well small amount location annotations. address challenge, present unified approach simultaneously performs disease identification localization underlying model images. demonstrate approach effectively leverage class information well limited location annotation, significantly outperforms comparative reference baseline classification localization tasks.",4 "union intersections (uoi) interpretable data driven discovery prediction. increasing size complexity scientific data could dramatically enhance discovery prediction basic scientific applications. realizing potential, however, requires novel statistical analysis methods interpretable predictive. introduce union intersections (uoi), flexible, modular, scalable framework enhanced model selection estimation. methods based uoi perform model selection model estimation intersection union operations, respectively. show uoi-based methods achieve low-variance nearly unbiased estimation small number interpretable features, maintaining high-quality prediction accuracy. perform extensive numerical investigation evaluate uoi algorithm ($uoi_{lasso}$) synthetic real data. so, demonstrate extraction interpretable functional networks human electrophysiology recordings well accurate prediction phenotypes genotype-phenotype data reduced features. also show (with $uoi_{l1logistic}$ $uoi_{cur}$ variants basic framework) improved prediction parsimony classification matrix factorization several benchmark biomedical data sets. results suggest methods based uoi framework could improve interpretation prediction data-driven discovery across scientific fields.",19 "learning attend deep architectures image tracking. discuss attentional model simultaneous object tracking recognition driven gaze data. motivated theories perception, model consists two interacting pathways: identity control, intended mirror pathways neuroscience models. 
identity pathway models object appearance performs classification using deep (factored)-restricted boltzmann machines. point time observations consist foveated images, decaying resolution toward periphery gaze. control pathway models location, orientation, scale speed attended object. posterior distribution states estimated particle filtering. deeper control pathway, encounter attentional mechanism learns select gazes minimize tracking uncertainty. unlike previous work, introduce gaze selection strategies operate presence partial information continuous action space. show straightforward extension existing approach partial information setting results poor performance, propose alternative method based modeling reward surface gaussian process. approach gives good performance presence partial information allows us expand action space small, discrete set fixation points continuous domain.",4 "learning explain non-standard english words phrases. describe data-driven approach automatically explaining new, non-standard english expressions given sentence, building large dataset includes 15 years crowdsourced examples urbandictionary.com. unlike prior studies focus matching keywords slang dictionary, investigate possibility learning neural sequence-to-sequence model generates explanations unseen non-standard english expressions given context. propose dual encoder approach---a word-level encoder learns representation context, second character-level encoder learn hidden representation target non-standard expression. model produce reasonable definitions new non-standard english expressions given context certain confidence.",4 "interactive visual data exploration subjective feedback: information-theoretic approach. visual exploration high-dimensional real-valued datasets fundamental task exploratory data analysis (eda). existing methods use predefined criteria choose representation data. lack methods (i) elicit user learned data (ii) show patterns know yet. 
construct theoretical model identified patterns input knowledge system. knowledge syntax intuitive, ""this set points forms cluster"", requires knowledge maths. background knowledge used find maximum entropy distribution data, system provides user data projections data maximum entropy distribution differ most, hence showing user aspects data maximally informative given user's current knowledge. provide open source eda system tailored interactive visualizations demonstrate concepts. study performance system present use cases synthetic real data. find model prototype system allow user learn information efficiently various data sources system works sufficiently fast practice. conclude information theoretic approach exploratory data analysis patterns observed user formalized constraints provides principled, intuitive, efficient basis constructing eda system.",19 "universal consistency minimax rates online mondrian forests. establish consistency algorithm mondrian forests, randomized classification algorithm implemented online. first, amend original mondrian forest algorithm, considers fixed lifetime parameter. indeed, fact parameter fixed hinders statistical consistency original procedure. modified mondrian forest algorithm grows trees increasing lifetime parameters $\lambda_n$, uses alternative updating rule, allowing work also online fashion. second, provide theoretical analysis establishing simple conditions consistency. theoretical analysis also exhibits surprising fact: algorithm achieves minimax rate (optimal rate) estimation lipschitz regression function, strong extension previous results arbitrary dimension.",19 "iterative school decomposition algorithm solving multi-school bus routing scheduling problem. servicing school transportation demand safely minimum number buses one highest financial goals school transportation directors. achieve objective, good efficient way solve routing scheduling problem required. 
due growth computing power, spotlight shed solving combined problem school bus routing scheduling. recent attempt tried model routing problem maximizing trip compatibilities hope requiring fewer buses scheduling problem. however, over-counting problem associated trip compatibility could diminish performance approach. extended model proposed paper resolve issue along iterative solution algorithm. extended model integrated model multi-school bus routing scheduling problem. result shows better solutions 8 test problems found fewer number buses (up 25%) shorter travel time (up 7% per trip).",12 "nonparametric metadata dependent relational model. introduce nonparametric metadata dependent relational (nmdr) model, bayesian nonparametric stochastic block model network data. nmdr allows entities associated node mixed membership unbounded collection latent communities. learned regression models allow memberships depend on, predicted from, arbitrary node metadata. develop efficient mcmc algorithms learning nmdr models partially observed node relationships. retrospective mcmc methods allow sampler work directly infinite stick-breaking representation nmdr, avoiding need finite truncations. results demonstrate recovery useful latent communities real-world social ecological networks, usefulness metadata link prediction tasks.",4 "unsupervised learning object landmarks factorized spatial embeddings. learning automatically structure object categories remains important open problem computer vision. paper, propose novel unsupervised approach discover learn landmarks object categories, thus characterizing structure. approach based factorizing image deformations, induced viewpoint change object deformation, learning deep neural network detects landmarks consistently visual effects. furthermore, show learned landmarks establish meaningful correspondences different object instances category without impose requirement explicitly. 
assess method qualitatively variety object types, natural man-made. also show unsupervised landmarks highly predictive manually-annotated landmarks face benchmark datasets, used regress high degree accuracy.",4 "randomized nonmonotone block proximal gradient method class structured nonlinear programming. propose randomized nonmonotone block proximal gradient (rnbpg) method minimizing sum smooth (possibly nonconvex) function block-separable (possibly nonconvex nonsmooth) function. iteration, method randomly picks block according prescribed probability distribution solves typically several associated proximal subproblems usually closed-form solution, certain progress objective value achieved. contrast usual randomized block coordinate descent method [23,20], method nonmonotone flavor uses variable stepsizes partially utilize local curvature information smooth component objective function. show accumulation point solution sequence method stationary point problem {\it almost surely} method capable finding approximate stationary point high probability. also establish sublinear rate convergence method terms minimal expected squared norm certain proximal gradients iterations. problem consideration convex, show expected objective values generated rnbpg converge optimal value problem. assumptions, establish sublinear linear rate convergence expected objective values generated monotone version rnbpg. finally, conduct preliminary experiments test performance rnbpg $\ell_1$-regularized least-squares problem dual svm problem machine learning. computational results demonstrate method substantially outperforms randomized block coordinate {\it descent} method fixed variable stepsizes.",12 "applying evolutionary optimisation robot obstacle avoidance. paper presents artificial evolution-based method stereo image analysis application real-time obstacle detection avoidance mobile robot.
uses parisian approach, consists splitting representation robot's environment large number simple primitives, ""flies"", evolved following biologically inspired scheme give fast, low-cost solution obstacle detection problem mobile robotics.",4 "hebbian/anti-hebbian network derived online non-negative matrix factorization cluster discover sparse features. despite extensive knowledge biophysical properties neurons, commonly accepted algorithmic theory neuronal function. explore hypothesis single-layer neuronal networks perform online symmetric nonnegative matrix factorization (snmf) similarity matrix streamed data. starting snmf cost function derive online algorithm, implemented biologically plausible network local learning rules. demonstrate network performs soft clustering data well sparse feature discovery. derived algorithm replicates many known aspects sensory anatomy biophysical properties neurons including unipolar nature neuronal activity synaptic weights, local synaptic plasticity rules dependence learning rate cumulative neuronal activity. thus, make step towards algorithmic theory neuronal function, facilitate large-scale neural circuit simulations biologically inspired artificial intelligence.",16 "sequential changepoint approach online community detection. present new algorithms detecting emergence community large networks sequential observations. networks modeled using erdos-renyi random graphs edges forming nodes community higher probability. based statistical changepoint detection methodology, develop three algorithms: exhaustive search (es), mixture, hierarchical mixture (h-mix) methods. performance methods evaluated average run length (arl), captures frequency false alarms, detection delay. numerical comparisons show es method performs best; however, exponentially complex. mixture method polynomially complex exploiting fact size community typically small large network. however, may react group active edges form community. 
issue resolved h-mix method, based dendrogram decomposition network. present asymptotic analytical expression arl mixture method threshold large. numerical simulation verifies approximation accurate even non-asymptotic regime. hence, used determine desired threshold efficiently. finally, numerical examples show mixture h-mix methods detect community quickly lower complexity es method.",19 "query-focused opinion summarization user-generated content. present submodular function-based framework query-focused opinion summarization. within framework, relevance ordering produced statistical ranker, information coverage respect topic distribution diverse viewpoints encoded submodular functions. dispersion functions utilized minimize redundancy. first evaluate different metrics text similarity submodularity-based summarization methods. experimenting community qa blog summarization, show system outperforms state-of-the-art approaches automatic evaluation human evaluation. human evaluation task conducted amazon mechanical turk scale, shows systems able generate summaries high overall quality information diversity.",4 "creating capsule wardrobes fashion images. propose automatically create capsule wardrobes. given inventory candidate garments accessories, algorithm must assemble minimal set items provides maximal mix-and-match outfits. pose task subset selection problem. permit efficient subset selection space outfit combinations, develop submodular objective functions capturing key ingredients visual compatibility, versatility, user-specific preference. since adding garments capsule expands possible outfits, devise iterative approach allow near-optimal submodular function maximization. finally, present unsupervised approach learn visual compatibility ""in wild"" full body outfit photos; compatibility metric translates well cleaner catalog photos improves existing methods. 
results thousands pieces popular fashion websites show automatic capsule creation potential mimic skilled fashionistas assembling flexible wardrobes, significantly scalable.",4 "learning $\ell^{0}$-graph: $\ell^{0}$-induced sparse subspace clustering. sparse subspace clustering methods, sparse subspace clustering (ssc) \cite{elhamifarv13} $\ell^{1}$-graph \cite{yanw09,chengyyfh10}, effective partitioning data lie union subspaces. methods use $\ell^{1}$-norm $\ell^{2}$-norm thresholding impose sparsity constructed sparse similarity graph, certain assumptions, e.g. independence disjointness, subspaces required obtain subspace-sparse representation, key success. assumptions guaranteed hold practice limit application sparse subspace clustering subspaces general location. paper, propose new sparse subspace clustering method named $\ell^{0}$-graph. contrast required assumptions subspaces existing sparse subspace clustering methods, proved subspace-sparse representation obtained $\ell^{0}$-graph arbitrary distinct underlying subspaces almost surely mild i.i.d. assumption data generation. develop proximal method obtain sub-optimal solution optimization problem $\ell^{0}$-graph proved guarantee convergence. moreover, propose regularized $\ell^{0}$-graph encourages nearby data similar neighbors similarity graph aligned within cluster graph connectivity issue alleviated. extensive experimental results various data sets demonstrate superiority $\ell^{0}$-graph compared competing clustering methods, well effectiveness regularized $\ell^{0}$-graph.",4 "robust video object tracking using particle filter likelihood based feature fusion adaptive template updating. robust algorithm solution proposed tracking object complex video scenes. solution, bootstrap particle filter (pf) initialized object detector, models time-evolving background video signal adaptive gaussian mixture. motion object expressed markov model, defines state transition prior. 
color texture features used represent object, marginal likelihood based feature fusion approach proposed. corresponding object template model updating procedure developed account possible scale changes object tracking process. experimental results show algorithm beats several existing alternatives tackling challenging scenarios video tracking tasks.",4 "exploring coevolution predator prey morphology behavior. common idiom biology education states, ""eyes front, animal hunts. eyes side, animal hides."" paper, explore one possible explanation predators tend forward-facing, high-acuity visual systems. using agent-based computational model evolution, predators prey interact adapt behavior morphology one another successive generations evolution. model, observe coevolutionary cycle prey swarming behavior predator's visual system, predator prey continually adapt visual system behavior, respectively, evolutionary time reaction one another due well-known ""predator confusion effect."" furthermore, provide evidence predator visual system drives coevolutionary cycle, suggest cycle could closed predator evolves hybrid visual system capable narrow, high-acuity vision tracking prey well broad, coarse vision prey discovery. thus, conflicting demands imposed predator's visual system predator confusion effect could led evolution complex eyes many predators.",16 "novel frank-wolfe algorithm. analysis applications large-scale svm training. recently, renewed interest machine learning community variants sparse greedy approximation procedure concave optimization known the frank-wolfe (fw) method. particular, procedure successfully applied train large-scale instances non-linear support vector machines (svms). specializing fw svm training allowed obtain efficient algorithms also important theoretical results, including convergence analysis training algorithms new characterizations model sparsity.
paper, present analyze novel variant fw method based new way perform away steps, classic strategy used accelerate convergence basic fw procedure. formulation analysis focused general concave maximization problem simplex. however, specialization algorithm quadratic forms strongly related classic methods computational geometry, namely gilbert mdm algorithms. theoretical side, demonstrate method matches guarantees terms convergence rate number iterations obtained using classic away steps. particular, method enjoys linear rate convergence, result recently proved mdm quadratic forms. practical side, provide experiments several classification datasets, evaluate results using statistical tests. experiments show method faster fw method classic away steps, works well even cases classic away steps slow algorithm. furthermore, improvements obtained without sacrificing predictive accuracy obtained svm model.",4 "discrete geodesic calculus space viscous fluidic objects. based local approximation riemannian distance manifold computationally cheap dissimilarity measure, time discrete geodesic calculus developed, applications shape space explored. dissimilarity measure derived deformation energy whose hessian reproduces underlying riemannian metric, used define length energy discrete paths shape space. notion discrete geodesics defined energy minimizing paths gives rise discrete logarithmic map, variational definition discrete exponential map, time discrete parallel transport. new concept applied shape space shapes considered boundary contours physical objects consisting viscous material. flexibility computational efficiency approach demonstrated topology preserving shape morphing, representation paths shape space via local shape variations path generators, shape extrapolation via discrete geodesic flow, transfer geometric features.",12 linear learning sparse data. linear predictors especially useful data high-dimensional sparse. 
one standard techniques used train linear predictor averaged stochastic gradient descent (asgd) algorithm. present efficient implementation asgd avoids dense vector operations. also describe translation invariant extension called centered averaged stochastic gradient descent (casgd).,4 "generalized additive model selection. introduce gamsel (generalized additive model selection), penalized likelihood approach fitting sparse generalized additive models high dimension. method interpolates null, linear additive models allowing effect variable estimated either zero, linear, low-complexity curve, determined data. present blockwise coordinate descent procedure efficiently optimizing penalized likelihood objective dense grid tuning parameter, producing regularization path additive models. demonstrate performance method real simulated data examples, compare existing techniques additive model selection.",19 "reverse hex solver. present solrex, an automated solver game reverse hex. reverse hex, also known rex, misere hex, variant game hex player joins two sides loses game. solrex performs mini-max search state space using scalable parallel depth first proof number search, enhanced pruning inferior moves early detection certain winning strategies. solrex implemented code base hex program solver, solve arbitrary positions board sizes 6x6, hardest position taking less four hours four threads.",4 memory capacity random neural network. paper considers problem information capacity random neural network. network represented matrices square symmetrical. matrices weight determines highest lowest possible value found matrix. examined matrices randomly generated analyzed computer program. find surprising result capacity network maximum binary random neural network change number quantization levels associated weights increases.,4 "support vector machine classification indefinite kernels. propose method support vector machine classification using indefinite kernels.
instead directly minimizing stabilizing nonconvex loss function, algorithm simultaneously computes support vectors proxy kernel matrix used forming loss. interpreted penalized kernel learning problem indefinite kernel matrices treated noisy observations true mercer kernel. formulation keeps problem convex relatively large problems solved efficiently using projected gradient analytic center cutting plane methods. compare performance technique methods several classic data sets.",4 "reinforcement imitation learning via interactive no-regret learning. recent work demonstrated problems-- particularly imitation learning structured prediction-- learner's predictions influence input-distribution tested naturally addressed interactive approach analyzed using no-regret online learning. approaches imitation learning, however, neither require benefit information cost actions. extend existing results two directions: first, develop interactive imitation learning approach leverages cost information; second, extend technique address reinforcement learning. results provide theoretical support commonly observed successes online approximate policy iteration. approach suggests broad new family algorithms provides unifying view existing techniques imitation reinforcement learning.",4 "hybrid genetic algorithm cloud computing applications. paper aid genetic algorithm fuzzy theory, present hybrid job scheduling approach, considers load balancing system reduces total execution time execution cost. try modify standard genetic algorithm reduce iteration creating population aid fuzzy theory. main goal research assign jobs resources considering vm mips length jobs. new algorithm assigns jobs resources considering job length resources capacities. evaluate performance approach famous cloud scheduling models. results experiments show efficiency proposed approach term execution time, execution cost average degree imbalance (di).",4 "symbolic approach reasoning linguistic quantifiers. 
paper investigates possibility performing automated reasoning probabilistic logic probabilities expressed means linguistic quantifiers. linguistic term expressed prescribed interval proportions. instead propagating numbers, qualitative terms propagated accordance numerical interpretation terms. quantified syllogism, modelling chaining probabilistic rules, studied context. shown qualitative counterpart syllogism makes sense, relatively independent threshold defining linguistically meaningful intervals, provided threshold values remain accordance intuition. inference power less full-fledged probabilistic constraint propagation device better corresponds could thought commonsense probabilistic reasoning.",4 "bayesian additive adaptive basis tensor product models modeling high dimensional surfaces: application high-throughput toxicity testing. many modern data sets sampled error complex high-dimensional surfaces. methods tensor product splines gaussian processes effective/well suited characterizing surface two three dimensions may suffer difficulties representing higher dimensional surfaces. motivated high throughput toxicity testing observed dose-response curves cross sections surface defined chemical's structural properties, model developed characterize surface predict untested chemicals' dose-responses. manuscript proposes novel approach models multidimensional surface sum learned basis functions formed tensor product lower dimensional functions, representable basis expansion learned data. model described, gibbs sampling algorithm proposed, investigated simulation study well data taken us epa's toxcast high throughput toxicity testing platform.",19 "long-range fractal correlations literary corpora. paper analyse fractal structure long human-language records mapping large samples texts onto time series. particular mapping set work inspired linguistic basis sense retains {\em word} fundamental unit communication.
results confirm beyond short-range correlations resulting syntactic rules acting sentence level, long-range structures emerge large written language samples give rise long-range correlations use words.",3 "structured low-rank matrix factorization: global optimality, algorithms, applications. recently, convex formulations low-rank matrix factorization problems received considerable attention machine learning. however, formulations often require solving matrix size data matrix, making challenging apply large scale datasets. moreover, many applications data display structures beyond simply low-rank, e.g., images videos present complex spatio-temporal structures largely ignored standard low-rank methods. paper study matrix factorization technique suitable large datasets captures additional structure factors using particular form regularization includes well-known regularizers total variation nuclear norm particular cases. although resulting optimization problem non-convex, show size factors large enough, certain conditions, local minimizer factors yields global minimizer. practical algorithms also provided solve matrix factorization problem, bounds distance given approximate solution optimization problem global optimum derived. examples neural calcium imaging video segmentation hyperspectral compressed recovery show advantages approach high-dimensional datasets.",4 "multi-scale multi-band densenets audio source separation. paper deals problem audio source separation. handle complex ill-posed nature problems audio source separation, current state-of-the-art approaches employ deep neural networks obtain instrumental spectra mixture. study, propose novel network architecture extends recently developed densely connected convolutional network (densenet), shown excellent results image classification tasks. deal specific problem audio source separation, up-sampling layer, block skip connection band-dedicated dense blocks incorporated top densenet. 
proposed approach takes advantage long contextual information outperforms state-of-the-art results sisec 2016 competition large margin terms signal-to-distortion ratio. moreover, proposed architecture requires significantly fewer parameters considerably less training time compared methods.",4 "cell assemblies multiple time scales arbitrary lag constellations. hebb's idea cell assembly fundamental unit neural information processing dominated neuroscience like theoretical concept within past 60 years. range different physiological phenomena, precisely synchronized spiking broadly simultaneous rate increases, subsumed term. yet progress area hampered lack statistical tools would enable extract assemblies arbitrary constellations time lags, multiple temporal scales, partly due severe computational burden. present unifying methodological conceptual framework detects assembly structure many different time scales, levels precision, arbitrary internal organization. applying methodology multiple single unit recordings various cortical areas, find universal cortical coding scheme, assembly structure precision significantly depends brain area recorded ongoing task demands.",16 "stochastic weighted function norm regularization. deep neural networks (dnns) become increasingly important due excellent empirical performance wide range problems. however, regularization generally achieved indirect means, largely due complex set functions defined network difficulty measuring function complexity. exists method literature additive regularization based norm function, classically considered statistical learning theory. work, propose sampling-based approximations weighted function norms regularizers deep neural networks. provide, best knowledge, first proof literature np-hardness computing function norms dnns, motivating necessity stochastic optimization strategy. 
based proposed regularization scheme, stability-based bounds yield $\mathcal{o}(n^{-\frac{1}{2}})$ generalization error proposed regularizer applied convex function sets. demonstrate broad conditions convergence stochastic gradient descent objective, including non-convex function sets defined dnns. finally, empirically validate improved performance proposed regularization strategy convex function sets well dnns real-world classification segmentation tasks.",4 "deep structured learning approach towards automating connectome reconstruction 3d electron micrographs. present deep structured learning method neuron segmentation 3d electron microscopy (em) improves significantly upon state art terms accuracy scalability. method consists 3d u-net classifier predicting affinity graphs voxels, followed iterative region agglomeration. train u-net using new structured loss based malis encourages topological correctness. extension consists two parts: first, $o(n\log(n))$ method compute loss gradient, improving originally proposed $o(n^2)$ algorithm. second, compute gradient two separate passes avoid spurious contributions early training stages. affinity predictions accurate enough simple agglomeration outperforms involved methods used earlier inferior predictions. present results three datasets (cremi, fib, segem) different imaging techniques animals achieve improvements previous results 27%, 15%, 250%. findings suggest single 3d segmentation strategy applied isotropic anisotropic em data. runtime method scales $o(n)$ size volume achieves throughput 2.6 seconds per megavoxel, allowing processing large datasets.",4 "application threshold accepting metaheuristic curriculum based course timetabling. article presents local search approach solution timetabling problems general, particular implementation competition track 3 international timetabling competition 2007 (itc 2007). heuristic search procedure based threshold accepting overcome local optima. 
stochastic neighborhood proposed implemented, randomly removing reassigning events current solution. overall concept incrementally obtained series experiments, describe (sub)section paper. result, successfully derived potential candidate solution approach finals track 3 itc 2007.",4 "learning spectral-spatial-temporal features via recurrent convolutional neural network change detection multispectral imagery. change detection one central problems earth observation extensively investigated recent decades. paper, propose novel recurrent convolutional neural network (recnn) architecture, trained learn joint spectral-spatial-temporal feature representation unified framework change detection multispectral images. end, bring together convolutional neural network (cnn) recurrent neural network (rnn) one end-to-end network. former able generate rich spectral-spatial feature representations, latter effectively analyzes temporal dependency bi-temporal images. comparison previous approaches change detection, proposed network architecture possesses three distinctive properties: 1) end-to-end trainable, contrast existing methods whose components separately trained computed; 2) naturally harnesses spatial information proven beneficial change detection task; 3) capable adaptively learning temporal dependency multitemporal images, unlike algorithms use fairly simple operation like image differencing stacking. far know, first time recurrent convolutional network architecture proposed multitemporal remote sensing image analysis. proposed network validated real multispectral data sets. visual quantitative analysis experimental results demonstrates competitive performance proposed mode.",4 lambek-grishin calculus np-complete. lambek-grishin calculus lg symmetric extension non-associative lambek calculus nl. paper prove derivability problem lg np-complete.,4 "tree-structured boosting: connections gradient boosted stumps full decision trees. 
additive models, produced gradient boosting, full interaction models, classification regression trees (cart), widely used algorithms investigated largely isolation. show models exist along spectrum, revealing never-before-known connections two approaches. paper introduces novel technique called tree-structured boosting creating single decision tree, shows method produce models equivalent cart gradient boosted stumps extremes varying single parameter. although tree-structured boosting designed primarily provide model interpretability predictive performance needed high-stake applications like medicine, also produce decision trees represented hybrid models cart boosted stumps outperform either approaches.",19 "two-step fusion process multi-criteria decision applied natural hazards mountains. mountain river torrents snow avalanches generate human material damages dramatic consequences. knowledge natural phenomenona often lacking expertise required decision risk management purposes using multi-disciplinary quantitative qualitative approaches. expertise considered decision process based imperfect information coming less reliable conflicting sources. methodology mixing analytic hierarchy process (ahp), multi-criteria aid-decision method, information fusion using belief function theory described. fuzzy sets possibilities theories allow transform quantitative qualitative criteria common frame discernment decision dempster-shafer theory (dst ) dezert-smarandache theory (dsmt) contexts. main issues consist basic belief assignments elicitation, conflict identification management, fusion rule choices, results validation also specific needs make difference importance reliability uncertainty fusion process.",4 "properties n-dimensional convolution image deconvolution. convolution system linear time invariant, describe optical imaging process. 
based convolution system, many deconvolution techniques developed optical image analysis, boosting space resolution optical images, image denoising, image enhancement on. here, gave properties n-dimensional convolution. using properties, proposed image deconvolution method. method uses series convolution operations deconvolute image. demonstrated method similar deconvolution results state-of-art method. core calculation proposed method image convolution, thus method easily integrated gpu mode large-scale image deconvolution.",6 "graphconnect: regularization framework neural networks. deep neural networks proved successful domains large training sets available, number training samples small, performance suffers overfitting. prior methods reducing overfitting weight decay, dropout dropconnect data-independent. paper proposes new method, graphconnect, data-dependent, motivated observation data interest lie close manifold. new method encourages relationships learned decisions resemble graph representing manifold structure. essentially graphconnect designed learn attributes present data samples contrast weight decay, dropout dropconnect simply designed make difficult fit random error noise. empirical rademacher complexity used connect generalization error neural network spectral properties graph learned input data. framework used show graphconnect superior weight decay. experimental results several benchmark datasets validate theoretical analysis, show number training samples small, graphconnect able significantly improve performance weight decay.",4 "impact cognitive radio future management spectrum. cognitive radio breakthrough technology expected profound impact way radio spectrum accessed, managed shared future. paper examine implications cognitive radio future management spectrum. 
near-term view involving opportunistic spectrum access model longer-term view involving self-regulating dynamic spectrum access model within society cognitive radios discussed.",4 "unified convex surrogate schatten-$p$ norm. schatten-$p$ norm ($0<p<1$) widely used approximate rank function. show any $p, p_1, p_2>0$ satisfying $1/p=1/p_1+1/p_2$, equivalence schatten-$p$ norm one matrix schatten-$p_1$ schatten-$p_2$ norms two factor matrices. extend equivalence multiple factor matrices show factor norms convex smooth $p>0$. contrast, original schatten-$p$ norm $0<p<1$ neither convex smooth.
1$, genetic algorithm operates quasispecies regime: advantageous mutant invades positive fraction population probability larger constant $p^*$ (which depend $m$). estimate next probability occurrence catastrophe (the whole population falls fitness level previously reached positive fraction population). asymptotic results suggest following rules: $\pi=\sigma(1-p_c)(1-p_m)^\ell$ slightly larger $1$; $p_m$ order $1/\ell$; $m$ larger $\ell\ln\ell$; running time exponential order $m$. first condition requires $ \ell p_m +p_c< \ln\sigma$. conclusions must taken great care: come asymptotic regime, formidable task understand relevance regime real-world problem. least, hope conclusions provide interesting guidelines practical implementation simple genetic algorithm.",12 "mean deviation similarity index: efficient reliable full-reference image quality evaluator. applications perceptual image quality assessment (iqa) image video processing, image acquisition, image compression, image restoration multimedia communication, led development many iqa metrics. paper, reliable full reference iqa model proposed utilize gradient similarity (gs), chromaticity similarity (cs), deviation pooling (dp). considering shortcomings commonly used gs model human visual system (hvs), new gs proposed fusion technique likely follow hvs. propose efficient effective formulation calculate joint similarity map two chromatic channels purpose measuring color changes. comparison commonly used formulation literature, proposed cs map shown efficient provide comparable better quality predictions. motivated recent work utilizes standard deviation pooling, general formulation dp presented paper used compute final score proposed gs cs maps. proposed formulation dp benefits minkowski pooling proposed power pooling well. 
experimental results six datasets natural images, synthetic dataset, digitally retouched dataset show proposed index provides comparable better quality predictions recent competing state-of-the-art iqa metrics literature, reliable low complexity. matlab source code proposed metric available https://www.mathworks.com/matlabcentral/fileexchange/59809.",4 "dense rgb-d semantic mapping pixel-voxel neural network. intelligent robotics applications, extending 3d mapping 3d semantic mapping enables robots to, localize respect scene's geometrical features also simultaneously understand higher level meaning scene contexts. previous methods focus geometric 3d reconstruction scene understanding independently notwithstanding fact joint estimation boost accuracy semantic mapping. paper, dense rgb-d semantic mapping system pixel-voxel network proposed, perform dense 3d mapping simultaneously recognizing semantically labelling point 3d map. proposed pixel-voxel network obtains global context information using pixelnet exploit rgb image meanwhile, preserves accurate local shape information using voxelnet exploit corresponding 3d point cloud. unlike existing architecture fuses score maps different models equal weights, proposed softmax weighted fusion stack adaptively learns varying contributions pixelnet voxelnet, fuses score maps two models according respective confidence levels. proposed pixel-voxel network achieves state-of-the-art semantic segmentation performance sun rgb-d benchmark dataset. runtime proposed system boosted 11-12hz, enabling near real-time performance using i7 8-cores pc titan x gpu.",4 "seamless integration coordination cognitive skills humanoid robots: deep learning approach. study investigates adequate coordination among different cognitive processes humanoid robot developed end-to-end learning direct perception visuomotor stream. propose deep dynamic neural network model built dynamic vision network, motor generation network, higher-level network. 
proposed model designed process integrate direct perception dynamic visuomotor patterns hierarchical model characterized different spatial temporal constraints imposed level. conducted synthetic robotic experiments robot learned read human's intention observing gestures generate corresponding goal-directed actions. results verify proposed model able learn tutored skills generalize novel situations. model showed synergic coordination perception, action decision making, integrated coordinated set cognitive skills including visual perception, intention reading, attention switching, working memory, action preparation execution seamless manner. analysis reveals coherent internal representations emerged level hierarchy. higher-level representation reflecting actional intention developed means continuous integration lower-level visuo-proprioceptive stream.",4 "genetic algorithm (ga) feature selection crf based manipuri multiword expression (mwe) identification. paper deals identification multiword expressions (mwes) manipuri, highly agglutinative indian language. manipuri listed eight schedule indian constitution. mwe plays important role applications natural language processing(nlp) like machine translation, part speech tagging, information retrieval, question answering etc. feature selection important factor recognition manipuri mwes using conditional random field (crf). disadvantage manual selection choosing appropriate features running crf motivates us think genetic algorithm (ga). using ga able find optimal features run crf. tried fifty generations feature selection along three fold cross validation fitness function. model demonstrated recall (r) 64.08%, precision (p) 86.84% f-measure (f) 73.74%, showing improvement crf based manipuri mwe identification without ga application.",4 "towards reverse-engineering black-box neural networks. many deployed learned models black boxes: given input, returns output. 
internal information model, architecture, optimisation procedure, training data, disclosed explicitly might contain proprietary information make system vulnerable. work shows attributes neural networks exposed sequence queries. multiple implications. one hand, work exposes vulnerability black-box neural networks different types attacks -- show revealed internal information helps generate effective adversarial examples black box model. hand, technique used better protection private content automatic recognition models using adversarial examples. paper suggests actually hard draw line white box black box models.",19 "disguised face identification (dfi) facial keypoints using spatial fusion convolutional network. disguised face identification (dfi) extremely challenging problem due numerous variations introduced using different disguises. paper introduces deep learning framework first detect 14 facial key-points utilized perform disguised face identification. since training deep learning architectures relies large annotated datasets, two annotated facial key-points datasets introduced. effectiveness facial keypoint detection framework presented keypoint. superiority key-point detection framework also demonstrated comparison deep networks. effectiveness classification performance also demonstrated comparison state-of-the-art face disguise classification methods.",4 "saberlda: sparsity-aware learning topic models gpus. latent dirichlet allocation (lda) popular tool analyzing discrete count data text images. applications require lda handle large datasets large number topics. though distributed cpu systems used, gpu-based systems emerged promising alternative high computational power memory bandwidth gpus. however, existing gpu-based lda systems cannot support large number topics use algorithms dense data structures whose time space complexity linear number topics. 
paper, propose saberlda, gpu-based lda system implements sparsity-aware algorithm achieve sublinear time complexity scales well learn large number topics. address challenges introduced sparsity, propose novel data layout, new warp-based sampling kernel, efficient sparse count matrix updating algorithm improves locality, makes efficient utilization gpu warps, reduces memory consumption. experiments show saberlda learn billions-token-scale data 10,000 topics, almost two orders magnitude larger previous gpu-based systems. single gpu card, saberlda able learn 10,000 topics dataset billions tokens hours, achievable clusters tens machines before.",4 "nested hierarchical dirichlet processes. develop nested hierarchical dirichlet process (nhdp) hierarchical topic modeling. nhdp generalization nested chinese restaurant process (ncrp) allows word follow path topic node according document-specific distribution shared tree. alleviates rigid, single-path formulation ncrp, allowing document easily express thematic borrowings random effect. derive stochastic variational inference algorithm model, addition greedy subtree selection method document, allows efficient inference using massive collections text documents. demonstrate algorithm 1.8 million documents new york times 3.3 million documents wikipedia.",19 "investigating effects diversity mechanisms evolutionary algorithms dynamic environments. evolutionary algorithms successfully applied variety optimisation problems stationary environments. however, many real world optimisation problems set dynamic environments success criteria shifts regularly. population diversity affects algorithmic performance, particularly multiobjective dynamic problems. diversity mechanisms methods altering evolutionary algorithms way promotes maintenance population diversity. 
project intends measure compare performance effect variety diversity mechanisms evolutionary algorithm facing assortment dynamic problems.",4 "aspect-based opinion summarization convolutional neural networks. paper considers aspect-based opinion summarization (aos) reviews particular products. enable real applications, aos system needs address two core subtasks, aspect extraction sentiment classification. existing approaches aspect extraction, use linguistic analysis topic modeling, general across different products precise enough suitable particular products. instead take less general precise scheme, directly mapping review sentence pre-defined aspects. tackle aspect mapping sentiment classification, propose two convolutional neural network (cnn) based methods, cascaded cnn multitask cnn. cascaded cnn contains two levels convolutional networks. multiple cnns level 1 deal aspect mapping task, single cnn level 2 deals sentiment classification. multitask cnn also contains multiple aspect cnns sentiment cnn, different networks share word embeddings. experimental results indicate cascaded multitask cnns outperform svm-based methods large margins. multitask cnn generally performs better cascaded cnn.",4 "graphs machine learning: introduction. graphs commonly used characterise interactions objects interest. based straightforward formalism, used many scientific fields computer science historical sciences. paper, give introduction methods relying graphs learning. includes unsupervised supervised methods. unsupervised learning algorithms usually aim visualising graphs latent spaces and/or clustering nodes. focus extracting knowledge graph topologies. existing techniques applicable static graphs, edges evolve time, recent developments shown could extended deal evolving networks. supervised context, one generally aims inferring labels numerical values attached nodes using graph and, available, node characteristics. 
balancing two sources information challenging, especially disagree locally globally. contexts, supervised un-supervised, data relational (augmented one several global graphs) described above, graph valued. latter case, object interest given full graph (possibly completed characteristics). context, natural tasks include graph clustering (as producing clusters graphs rather clusters nodes single graph), graph classification, etc. 1 real networks one first practical studies graphs dated back original work moreno [51] 30s. since then, growing interest graph analysis associated strong developments modelling processing data. graphs used many scientific fields. biology [54, 2, 7], instance, metabolic networks describe pathways biochemical reactions [41], social sciences networks used represent relation ties actors [66, 56, 36, 34]. examples include powergrids [71] web [75]. recently, networks also considered areas geography [22] history [59, 39]. machine learning, networks seen powerful tools model problems order extract information data prediction purposes. object paper. complete surveys, refer [28, 62, 49, 45]. section, introduce notations highlight properties shared real networks. section 2, consider methods aiming extracting information unique network. particularly focus clustering methods goal find clusters vertices. finally, section 3, techniques take series networks account, network",19 "discriminative density-ratio estimation. covariate shift challenging problem supervised learning results discrepancy training test distributions. effective approach recently drew considerable attention research community reweight training samples minimize discrepancy. specific, many methods based developing density-ratio (dr) estimation techniques apply regression classification problems. although methods work well regression problems, performance classification problems satisfactory. 
due key observation methods focus matching sample marginal distributions without paying attention preserving separation classes reweighted space. paper, propose novel method discriminative density-ratio (ddr) estimation addresses aforementioned problem aims estimating density-ratio joint distributions class-wise manner. proposed algorithm iterative procedure alternates estimating class information test data estimating new density ratio class. incorporate estimated class information test data, soft matching technique proposed. addition, employ effective criterion adopts mutual information indicator stop iterative procedure resulting decision boundary lies sparse region. experiments synthetic benchmark datasets demonstrate superiority proposed method terms accuracy robustness.",4 "recognizing combinations facial action units different intensity using mixture hidden markov models neural network. facial action coding system consists 44 action units (aus) 7000 combinations. hidden markov models (hmms) classifier used successfully recognize facial action units (aus) expressions due ability deal au dynamics. however, separate hmm necessary single au au combination. since combinations au numbering thousands, efficient method needed. paper accurate real-time sequence-based system representation recognition facial aus presented. system following characteristics: 1) employing mixture hmms neural network, develop novel accurate classifier, deal au dynamics, recognize subtle changes, also robust intensity variations, 2) although use hmm single au only, employing neural network recognize single combination au, 3) using geometric appearance-based features, applying efficient dimension reduction techniques, system robust illumination changes represent temporal information involved formation facial expressions. extensive experiments cohn-kanade database show superiority proposed method, comparison classifiers. 
keywords: classifier design evaluation, data fusion, facial action units (aus), hidden markov models (hmms), neural network (nn).",4 "orthographic syllable basic unit smt related languages. explore use orthographic syllable, variable-length consonant-vowel sequence, basic unit translation related languages use abugida alphabetic scripts. show orthographic syllable level translation significantly outperforms models trained basic units (word, morpheme character) training small parallel corpora.",4 "geometric blind source separation method based facet component analysis. given set mixtures, blind source separation attempts retrieve source signals without little information mixing process. present geometric approach blind separation nonnegative linear mixtures termed {\em facet component analysis} (fca). approach based facet identification underlying cone structure data. earlier works focus recovering cone locating vertices (vertex component analysis vca) based mutual sparsity condition requires source signal possess stand-alone peak spectrum. formulate alternative conditions enough data points fall facets cone instead accumulating around vertices. find regime unique solvability, make use geometric density properties data points, develop efficient facet identification method combining data classification linear regression. noisy data, show denoising methods may employed, total variation technique image processing, principal component analysis. show computational results nuclear magnetic resonance spectroscopic data substantiate method.",12 "use non-stationary policies infinite-horizon discounted markov decision processes. consider infinite-horizon $\gamma$-discounted markov decision processes, known exists stationary optimal policy. consider algorithm value iteration sequence policies $\pi_1,...,\pi_k$ implicitly generates iteration $k$. 
provide performance bounds non-stationary policies involving last $m$ generated policies reduce state-of-the-art bound last stationary policy $\pi_k$ factor $\frac{1-\gamma}{1-\gamma^m}$. particular, use non-stationary policies allows reduce usual asymptotic performance bounds value iteration errors bounded $\epsilon$ iteration $\frac{\gamma}{(1-\gamma)^2}\epsilon$ $\frac{\gamma}{1-\gamma}\epsilon$, significant usual situation $\gamma$ close 1. given bellman operators computed error $\epsilon$, surprising consequence result problem ""computing approximately optimal non-stationary policy"" much simpler ""computing approximately optimal stationary policy"", even slightly simpler ""approximately computing value fixed policy"", since last problem guarantee $\frac{1}{1-\gamma}\epsilon$.",4 serious flaws korf et al.'s analysis time complexity a*. paper withdrawn.,4 "adaptive framework tune coordinate systems evolutionary algorithms. evolutionary computation research community, performance evolutionary algorithms (eas) depends strongly implemented coordinate system. however, commonly used coordinate system fixed well suited different function landscapes, eas thus might search efficiently. overcome shortcoming, paper propose framework, named acos, adaptively tune coordinate systems eas. acos, eigen coordinate system established making use cumulative population distribution information, obtained based covariance matrix adaptation strategy additional archiving mechanism. since population distribution information reflect features function landscape extent, eas eigen coordinate system capability identify modality function landscape. addition, eigen coordinate system coupled original coordinate system, selected according probability vector. probability vector aims determine selection ratio coordinate system individual, adaptively updated based collected information offspring. 
acos applied two popular ea paradigms, i.e., particle swarm optimization (pso) differential evolution (de), solving 30 test functions 30 50 dimensions 2014 ieee congress evolutionary computation. experimental studies demonstrate effectiveness.",4 "learning compose skills. present differentiable framework capable learning wide variety compositions simple policies call skills. recursively composing skills themselves, create hierarchies display complex behavior. skill networks trained generate skill-state embeddings provided inputs trainable composition function, turn outputs policy overall task. experiments environment consisting multiple collect evade tasks show architecture able quickly build complex skills simpler ones. furthermore, learned composition function displays transfer unseen combinations skills, allowing zero-shot generalizations.",4 "combinatorial pyramids discrete geometry energy-minimizing segmentation. paper defines basis new hierarchical framework segmentation algorithms based energy minimization schemes. new framework based two formal tools. first, combinatorial pyramid encode efficiently hierarchy partitions. secondly, discrete geometric estimators measure precisely important geometric parameters regions. measures combined photometrical topological features partition allows design energy terms based discrete measures. segmentation framework exploits energies build pyramid image partitions minimization scheme. experiments illustrating framework shown discussed.",4 "classification sparse overlapping groups. classification sparsity constraint solution plays central role many high dimensional machine learning applications. cases, features grouped together entire subsets features selected selected. many applications, however, restrictive. paper, interested less restrictive form structured sparse feature selection: assume features grouped according notion similarity, features group need selected task hand. 
groups comprised disjoint sets features, sometimes referred ""sparse group"" lasso, allows working richer class models traditional group lasso methods. framework generalizes conventional sparse group lasso allowing overlapping groups, additional flexibility needed many applications one presents challenges. main contribution paper new procedure called sparse overlapping group (sog) lasso, convex optimization program automatically selects similar features classification high dimensions. establish model selection error bounds soglasso classification problems fairly general setting. particular, error bounds first results classification using sparse group lasso. furthermore, general soglasso bound specializes results lasso group lasso, known new. soglasso motivated multi-subject fmri studies functional activity classified using brain voxels features, source localization problems magnetoencephalography (meg), analyzing gene activation patterns microarray data analysis. experiments real synthetic data demonstrate advantages soglasso compared lasso group lasso.",4 "landcover fuzzy logic classification maximum likelihood. present days remote sensing used application many sectors. remote sensing uses different images like multispectral, hyper spectral ultra spectral. remote sensing image classification one significant method classify image. state classify maximum likelihood classification fuzzy logic. experimenting fuzzy logic like spatial, spectral texture methods different sub methods used image classification.",4 "languages actions, formal grammars qualitative modeling companies. paper discuss methods using language actions, formal languages, grammars qualitative conceptual linguistic modeling companies technological human institutions. main problem following discussion problem find describe language structure external internal flow information companies. anticipate language structure external internal base flows determine structure companies. 
structure modeling abstract industrial company internal base flow information constructed certain flow words composed theoretical parts-processes-actions language. language procedures found external base flow information insurance company. formal stochastic grammar language procedures found statistical methods used understanding tendencies health care industry. present model human communications random walk semantic tree",4 "data granulation principles uncertainty. researches granular modeling produced variety mathematical models, intervals, (higher-order) fuzzy sets, rough sets, shadowed sets, suitable characterize so-called information granules. modeling input data uncertainty recognized crucial aspect information granulation. moreover, uncertainty well-studied concept many mathematical settings, probability theory, fuzzy set theory, possibility theory. fact suggests appropriate quantification uncertainty expressed information granule model could used define invariant property, exploited practical situations information granulation. perspective, procedure information granulation effective uncertainty conveyed synthesized information granule monotonically increasing relation uncertainty input data. paper, present data granulation framework elaborates principles uncertainty introduced klir. uncertainty mesoscopic descriptor systems data, possible apply principles regardless input data type specific mathematical setting adopted information granules. proposed framework conceived (i) offer guideline synthesis information granules (ii) build groundwork compare quantitatively judge different data granulation procedures. provide suitable case study, introduce new data granulation technique based minimum sum distances, designed generate type-2 fuzzy sets. analyze procedure performing different experiments two distinct data types: feature vectors labeled graphs. 
results show uncertainty input data suitably conveyed generated type-2 fuzzy set models.",4 "read-bad: new dataset evaluation scheme baseline detection archival documents. text line detection crucial application associated automatic text recognition keyword spotting. modern algorithms perform good well-established datasets since either comprise clean data simple/homogeneous page layouts. collected annotated 2036 archival document images different locations time periods. dataset contains varying page layouts degradations challenge text line segmentation methods. well established text line segmentation evaluation schemes detection rate recognition accuracy demand binarized data annotated pixel level. producing ground truth means laborious needed determine method's quality. paper propose new evaluation scheme based baselines. proposed scheme need binarization handle skewed well rotated text lines. icdar 2017 competition baseline detection icdar 2017 competition layout analysis challenging medieval manuscripts used evaluation scheme. finally, present results achieved recently published text line detection algorithm.",4 "modularity component analysis versus principal component analysis. paper exact linear relation leading eigenvectors modularity matrix singular vectors uncentered data matrix developed. based analysis concept modularity component defined, properties developed. shown modularity component analysis used cluster data similar traditional principal component analysis used except modularity component analysis require data centering.",19 "mining determinism human strategic behavior. work lies fusion experimental economics data mining. continues author's previous work mining behaviour rules human subjects experimental data, game-theoretic predictions partially fail work. game-theoretic predictions aka equilibria tend success experienced subjects specific games, rarely given. apart game theory, contemporary experimental economics offers number alternative models. 
relevant literature, models always biased psychological near-psychological theories claimed proven data. work introduces data mining approach problem without using vast psychological background. apart determinism, biases regarded. two datasets different human subject experiments taken evaluation. first one repeated mixed strategy zero sum game second - repeated ultimatum game. result, way mining deterministic regularities human strategic behaviour described evaluated. future work, design new representation formalism discussed.",4 "learning inverse mappings adversarial criterion. propose flipped-adversarial autoencoder (faae) simultaneously trains generative model g maps arbitrary latent code distribution data distribution encoder e embodies ""inverse mapping"" encodes data sample latent code vector. unlike previous hybrid approaches leverage adversarial training criterion constructing autoencoders, faae minimizes re-encoding errors latent space exploits adversarial criterion data space. experimental evaluations demonstrate proposed framework produces sharper reconstructed images time enabling inference captures rich semantic representation data.",4 "encoding monotonic multi-set preferences using ci-nets: preliminary report. cp-nets variants constitute one main ai approaches specifying reasoning preferences. ci-nets, particular, cp-inspired formalism representing ordinal preferences sets goods, typically required monotonic. considering also goods often come multi-sets rather sets, natural question whether ci-nets used less directly encode preferences multi-sets. provide initial ideas achieve this, sense least restricted form reasoning framework, call ""confined reasoning"", efficiently reduced reasoning ci-nets. framework nevertheless allows encoding preferences multi-sets unbounded multiplicities. 
also show extent used represent preferences multiplicities goods stated explicitly (""purely qualitative preferences"") well potential use generalization ci-nets component recent system evidence aggregation.",4 "probabilistic reasoning information compression multiple alignment, unification search: introduction overview. article introduces idea probabilistic reasoning (pr) may understood ""information compression multiple alignment, unification search"" (icmaus). context, multiple alignment meaning similar distinct meaning bio-informatics, unification means simple merging matching patterns, meaning related simpler meaning term logic. software model, sp61, developed discovery formation 'good' multiple alignments, evaluated terms information compression. model described outline. using examples sp61 model, article describes outline icmaus framework model various kinds pr including: pr best-match pattern recognition information retrieval; one-step 'deductive' 'abductive' pr; inheritance attributes class hierarchy; chains reasoning (probabilistic decision networks decision trees, pr 'rules'); geometric analogy problems; nonmonotonic reasoning reasoning default values; modelling function bayesian network.",4 "adaptive gril estimator diverging number parameters. consider problem variables selection estimation linear regression model situations number parameters diverges sample size. propose adaptive generalized ridge-lasso (\mbox{adagril}) extension adaptive elastic net. adagril incorporates information redundancy among correlated variables model selection estimation. combines strengths quadratic regularization adaptively weighted lasso shrinkage. paper, highlight grouped selection property adacnet method (one type adagril) equal correlation case. weak conditions, establish oracle property adagril ensures optimal large performance dimension high. consequently, achieves goals handling problem collinearity high dimension enjoys oracle property. 
moreover, show adagril estimator achieves sparsity inequality, i.e., bound terms number non-zero components 'true' regression coefficient. bound obtained similar weak restricted eigenvalue (re) condition used lasso. simulation studies show particular cases adagril outperform competitors.",19 "accelerating neural architecture search using performance prediction. methods neural network hyperparameter optimization meta-modeling computationally expensive due need train large number model configurations. paper, show standard frequentist regression models predict final performance partially trained model configurations using features based network architectures, hyperparameters, time-series validation performance data. empirically show performance prediction models much effective prominent bayesian counterparts, simpler implement, faster train. models predict final performance visual classification language modeling domains, effective predicting performance drastically varying model architectures, even generalize model classes. using prediction models, also propose early stopping method hyperparameter optimization meta-modeling, obtains speedup factor 6x hyperparameter optimization meta-modeling. finally, empirically show early stopping method seamlessly incorporated reinforcement learning-based architecture selection algorithms bandit based search methods. extensive experimentation, empirically show performance prediction models early stopping algorithm state-of-the-art terms prediction accuracy speedup achieved still identifying optimal model configurations.",4 "9-im topological operators qualitative spatial relations using 3d selective nef complexes logic rules bodies. paper presents method compute automatically topological relations using swrl rules. calculation rules based definition selective nef complexes nef polyhedra structure generated standard polyhedron. 
selective nef complexes data model providing set binary boolean operators union, difference, intersection symmetric difference, unary operators interior, closure boundary. work, operators used compute topological relations objects defined constraints 9 intersection model (9-im) egenhofer. help constraints, defined procedure compute topological relations nef polyhedra. topological relationships disjoint, meets, contains, inside, covers, coveredby, equals overlaps, defined top-level ontology specific semantic definition relation transitive, symmetric, asymmetric, functional, reflexive, irreflexive. results computation topological relationships stored owl-dl ontology allowing infer new relationships objects. addition, logic rules based semantic web rule language allows definition logic programs define topological relationships computed kind objects specific attributes. instance, ""building"" overlaps ""railway"" ""railstation"".",4 "handwritten digit recognition committee deep neural nets gpus. competitive mnist handwritten digit recognition benchmark long history broken records since 1998. recent substantial improvement others dates back 7 years (error rate 0.4%) . recently able significantly improve result, using graphics cards greatly speed training simple deep mlps, achieved 0.35%, outperforming previous complex methods. report another substantial improvement: 0.31% obtained using committee mlps.",4 "advanced mean field theory restricted boltzmann machine. learning restricted boltzmann machine typically hard due computation gradients log-likelihood function. describe network state statistics restricted boltzmann machine, develop advanced mean field theory based bethe approximation. theory provides efficient message passing based method evaluates partition function (free energy) also gradients without requiring statistical sampling. 
results compared obtained computationally expensive sampling based method.",3 "priors initial hyperparameters affect gaussian process regression models. hyperparameters gaussian process regression (gpr) model specified kernel often estimated data via maximum marginal likelihood. due non-convexity marginal likelihood respect hyperparameters, optimization may converge global maxima. common approach tackle issue use multiple starting points randomly selected specific prior distribution. result choice prior distribution may play vital role predictability approach. however, exists little research literature study impact prior distributions hyperparameter estimation performance gpr. paper, provide first empirical study problem using simulated real data experiments. consider different types priors initial values hyperparameters commonly used kernels investigate influence priors predictability gpr models. results reveal that, kernel chosen, different priors initial hyperparameters significant impact performance gpr prediction, despite estimates hyperparameters different true values cases.",19 "school bus routing maximizing trip compatibility. school bus planning usually divided routing scheduling due complexity solving concurrently. however, separation two steps may lead worse solutions higher overall costs solving together. finding minimal number trips routing problem, neglecting importance trip compatibility may increase number buses actually needed scheduling problem. paper proposes new formulation multi-school homogeneous fleet routing problem maximizes trip compatibility minimizing total travel time. incorporates trip compatibility scheduling problem routing problem. since problem inherently routing problem, finding good solution cumbersome. compare performance model traditional routing problems, generate eight mid-size data sets. 
importing generated trips routing problems bus scheduling (blocking) problem, shown proposed model uses 13% fewer buses common traditional routing models.",12 "unfolding partiality disjunctions stable model semantics. paper studies implementation methodology partial disjunctive stable models partiality disjunctions unfolded logic program implementation stable models normal (disjunction-free) programs used core inference engine. unfolding done two separate steps. firstly, shown partial stable models captured total stable models using simple linear modular program transformation. hence, reasoning tasks concerning partial stable models solved using implementation total stable models. disjunctive partial stable models lacking implementations become available translation handles also disjunctive case. secondly, shown total stable models disjunctive programs determined computing stable models normal programs. hence, implementation stable models normal programs used core engine implementing disjunctive programs. feasibility approach demonstrated constructing system computing stable models disjunctive programs using smodels system core engine. performance resulting system compared dlv state-of-the-art special purpose system disjunctive programs.",4 "learning reporting dynamics breaking news rumour detection social media. breaking news leads situations fast-paced reporting social media, producing kinds updates related news stories, albeit caveat early updates tend rumours, i.e., information unverified status time posting. flagging information unverified helpful avoid spread information may turn false. detection rumours also feed rumour tracking system ultimately determines veracity. paper introduce novel approach rumour detection learns sequential dynamics reporting breaking news social media detect rumours new stories. 
using twitter datasets collected five breaking news stories, experiment conditional random fields sequential classifier leverages context learnt event rumour detection, compare state-of-the-art rumour detection system well baselines. contrast existing work, classifier need observe tweets querying piece information deem rumour, instead detect rumours tweet alone exploiting context learnt event. classifier achieves competitive performance, beating state-of-the-art classifier relies querying tweets improved precision recall, well outperforming best baseline nearly 40% improvement terms f1 score. scale diversity experiments reinforces generalisability classifier.",4 similar elements metric labeling complete graphs. consider problem involves finding similar elements collection sets. problem motivated applications machine learning pattern recognition. formulate similar elements problem optimization give efficient approximation algorithm finds solution within factor 2 optimal. similar elements problem special case metric labeling problem also give efficient 2-approximation algorithm metric labeling problem complete graphs.,4 "early stage influenza detection twitter. influenza acute respiratory illness occurs virtually every year results substantial disease, death expense. detection influenza earliest stage would facilitate timely action could reduce spread illness. existing systems cdc eiss try collect diagnosis data, almost entirely manual, resulting two-week delays clinical data acquisition. twitter, popular microblogging service, provides us perfect source early-stage flu detection due real-time nature. example, flu breaks out, people get flu may post related tweets enables detection flu breakout promptly. paper, investigate real-time flu detection problem twitter data proposing flu markov network (flu-mn): spatio-temporal unsupervised bayesian algorithm based 4 phase markov network, trying identify flu breakout earliest stage. 
test model real twitter datasets united states along baselines multiple applications, real-time flu breakout detection, future epidemic phase prediction, influenza-like illness (ili) physician visits. experimental results show robustness effectiveness approach. build real time flu reporting system based proposed approach, hopeful would help government health organizations identifying flu outbreaks facilitating timely actions decrease unnecessary mortality.",4 "diffusion component analysis: unraveling functional topology biological networks. complex biological systems successfully modeled biochemical genetic interaction networks, typically gathered high-throughput (htp) data. networks used infer functional relationships genes proteins. using intuition topological role gene network relates biological function, local diffusion based ""guilt-by-association"" graph-theoretic methods success inferring gene functions. seek improve function prediction integrating diffusion-based methods novel dimensionality reduction technique overcome incomplete noisy nature network data. paper, introduce diffusion component analysis (dca), framework plugs diffusion model learns low-dimensional vector representation node encode topological properties network. proof concept, demonstrate dca's substantial improvement state-of-the-art diffusion-based approaches predicting protein function molecular interaction networks. moreover, dca framework integrate multiple networks heterogeneous sources, consisting genomic information, biochemical experiments resources, even improve function prediction. yet another layer performance gain achieved integrating dca framework support vector machines take node vector representations features. overall, dca framework provides novel representation nodes network used plug-in architecture machine learning algorithms decipher topological properties obtain novel insights interactomes.",16 "surrogate model assisted cooperative coevolution large scale optimization. 
shown cooperative coevolution (cc) effectively deal large scale optimization problems (lsops) divide-and-conquer strategy. however, performance severely restricted current context-vector-based sub-solution evaluation method since method needs access original high dimensional simulation model evaluating sub-solution thus requires many computation resources. alleviate issue, study proposes novel surrogate model assisted cooperative coevolution (sacc) framework. sacc constructs surrogate model sub-problem obtained via decomposition employs evaluate corresponding sub-solutions. original simulation model adopted reevaluate good sub-solutions selected surrogate models, real evaluated sub-solutions turn employed update surrogate models. means, computation cost could greatly reduced without significantly sacrificing evaluation quality. show efficiency sacc, study uses radial basis function (rbf) success-history based adaptive differential evolution (shade) surrogate model optimizer, respectively. rbf shade proved effective small medium scale problems. study first scales lsops 1000 dimensions sacc framework, tailored certain extent adapting characteristics lsop sacc. empirical studies ieee cec 2010 benchmark functions demonstrate sacc significantly enhances evaluation efficiency sub-solutions, even much fewer computation resource, resultant rbf-shade-sacc algorithm able find much better solutions traditional cc algorithms.",4 "mixing energy models genetic algorithms on-lattice protein structure prediction. protein structure prediction (psp) computationally challenging problem. challenge largely comes fact energy function needs minimised order obtain native structure given protein clearly known. high resolution 20x20 energy model could better capture behaviour actual energy function low resolution energy model hydrophobic polar. however, fine grained details high resolution interaction energy matrix often informative guiding search. 
contrast, low resolution energy model could effectively bias search towards certain promising directions. paper, develop genetic algorithm mainly uses high resolution energy model protein structure evaluation uses low resolution hp energy model focussing search towards exploring structures hydrophobic cores. experimentally show mixing energy models leads significant lower energy structures compared state-of-the-art results.",4 "human-grounded evaluation benchmark local explanations machine learning. order people able trust take advantage results advanced machine learning artificial intelligence solutions real decision making, people need able understand machine rationale given output. research explain artificial intelligence (xai) addresses aim, need evaluation human relevance understandability explanations. work contributes novel methodology evaluating quality human interpretability explanations machine learning models. present evaluation benchmark instance explanations text image classifiers. explanation meta-data benchmark generated user annotations image text samples. describe benchmark demonstrate utility quantitative evaluation explanations generated recent machine learning algorithm. research demonstrates human-grounded evaluation could used measure qualify local machine-learning explanations.",4 "anisotropic diffusion-based kernel matrix model face liveness detection. facial recognition verification widely used biometric technology security system. unfortunately, face biometrics vulnerable spoofing attacks using photographs videos. paper, present anisotropic diffusion-based kernel matrix model (adkmm) face liveness detection prevent face spoofing attacks. use anisotropic diffusion enhance edges boundary locations face image, kernel matrix model extract face image features call diffusion-kernel (d-k) features. d-k features reflect inner correlation face image sequence. 
introduce convolutional neural networks extract deep features, then, employ generalized multiple kernel learning method fuse d-k features deep features achieve better performance. experimental evaluation two publicly available datasets shows proposed method outperforms state-of-the-art face liveness detection methods.",4 "learning hidden unit contributions unsupervised acoustic model adaptation. work presents broad study adaptation neural network acoustic models means learning hidden unit contributions (lhuc) -- method linearly re-combines hidden units speaker- environment-dependent manner using small amounts unsupervised adaptation data. also extend lhuc speaker adaptive training (sat) framework leads adaptable dnn acoustic model, working speaker-dependent speaker-independent manner, without requirements maintain auxiliary speaker-dependent feature extractors introduce significant speaker-dependent changes dnn structure. series experiments four different speech recognition benchmarks (ted talks, switchboard, ami meetings, aurora4) comprising 270 test speakers, show lhuc test-only sat variants results consistent word error rate reductions ranging 5% 23% relative depending task degree mismatch training test data. addition, investigated effect amount adaptation data per speaker, quality unsupervised adaptation targets, complementarity adaptation techniques, one-shot adaptation, extension adapting dnns trained sequence discriminative manner.",4 "polarimetric hierarchical semantic model scattering mechanism based polsar image classification. polarimetric sar (polsar) image classification, challenge classify aggregated terrain types, urban area, semantic homogenous regions due sharp bright-dark variations intensity. aggregated terrain type formulated similar ground objects aggregated together. paper, polarimetric hierarchical semantic model (phsm) firstly proposed overcome disadvantage based constructions primal-level middle-level semantic. 
primal-level semantic polarimetric sketch map consists sketch segments sparse representation polsar image. middle-level semantic region map extract semantic homogenous regions sketch map exploiting topological structure sketch segments. mapping region map polsar image, complex polsar scene partitioned aggregated, structural homogenous pixel-level subspaces characteristics relatively coherent terrain types subspace. then, according characteristics three subspaces above, three specific methods adopted, furthermore polarimetric information exploited improve segmentation result. experimental results polsar data sets different bands sensors demonstrate proposed method superior state-of-the-art methods region homogeneity edge preservation terrain classification.",4 "occurrence statistics entities, relations types web. problem collecting reliable estimates occurrence entities open web forms premise report. models learned tagging entities cannot expected perform well deployed web. owing severe mismatch distributions entities web relatively diminutive training data. report, build case maximum mean discrepancy estimation occurrence statistics entities web, taking review named entity disambiguation techniques related concepts along way.",4 turkish pos tagging reducing sparsity morpheme tags small datasets. sparsity one major problems natural language processing. problem becomes even severe agglutinating languages highly prone inflected. deal sparsity turkish adopting morphological features part-of-speech tagging. learn inflectional derivational morpheme tags turkish using conditional random fields (crf) employ morpheme tags part-of-speech (pos) tagging using hidden markov models (hmms) mitigate sparsity. results show using morpheme tags pos tagging helps alleviate sparsity emission probabilities. model outperforms hidden markov model based pos tagging models small training datasets turkish. 
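The Turkish POS-tagging abstract above decodes tags with hidden Markov models. A minimal Viterbi sketch over assumed toy transition and emission tables; the two words and all probabilities are illustrative, not the paper's trained CRF/HMM:

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most likely state (tag) sequence for an observation sequence."""
    # Each entry maps state -> (best probability so far, best path so far).
    V = [{s: (start_p[s] * emit_p[s].get(obs[0], 0.0), [s]) for s in states}]
    for o in obs[1:]:
        layer = {}
        for s in states:
            prob, path = max(
                (V[-1][prev][0] * trans_p[prev][s] * emit_p[s].get(o, 0.0),
                 V[-1][prev][1] + [s])
                for prev in states)
            layer[s] = (prob, path)
        V.append(layer)
    return max(V[-1].values())[1]

states = ("NOUN", "VERB")
start = {"NOUN": 0.6, "VERB": 0.4}
trans = {"NOUN": {"NOUN": 0.3, "VERB": 0.7}, "VERB": {"NOUN": 0.8, "VERB": 0.2}}
emit = {"NOUN": {"ev": 0.5, "git": 0.1}, "VERB": {"ev": 0.1, "git": 0.6}}
print(viterbi(["ev", "git"], states, start, trans, emit))  # ['NOUN', 'VERB']
```

In the abstract's setup the emission tables would be defined over morpheme tags rather than surface words, which is what mitigates sparsity.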
obtain accuracy 94.1% morpheme tagging 89.2% pos tagging 5k training dataset.,4 "online influence maximization independent cascade model semi-bandit feedback. study stochastic online problem learning influence social network semi-bandit feedback, observe users influence other. problem combines challenges limited feedback, learning agent observes influenced portion network, combinatorial number actions, cardinality feasible set exponential maximum number influencers. propose computationally efficient ucb-like algorithm, imlinucb, analyze it. regret bounds polynomial quantities interest; reflect structure network probabilities influence. moreover, depend inherently large quantities, cardinality action set. best knowledge, first results. imlinucb permits linear generalization therefore suitable large-scale problems. experiments show regret imlinucb scales suggested upper bounds several representative graph topologies; based linear generalization, imlinucb significantly reduce regret real-world influence maximization semi-bandits.",4 "spatially transformed adversarial examples. recent studies show widely used deep neural networks (dnns) vulnerable carefully crafted adversarial examples. many advanced algorithms proposed generate adversarial examples leveraging $\mathcal{l}_p$ distance penalizing perturbations. researchers explored different defense methods defend adversarial attacks. effectiveness $\mathcal{l}_p$ distance metric perceptual quality remains active research area, paper instead focus different type perturbation, namely spatial transformation, opposed manipulating pixel values directly prior works. perturbations generated spatial transformation could result large $\mathcal{l}_p$ distance measures, extensive experiments show spatially transformed adversarial examples perceptually realistic difficult defend existing defense systems. potentially provides new direction adversarial example generation design corresponding defenses. 
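The influence-maximization abstract above works in the independent cascade model. A sketch of one cascade simulation on an assumed toy graph; the edge probability, seeds, and graph are illustrative:

```python
import random

def independent_cascade(graph, seeds, p, rng):
    """One spread simulation: each newly activated node gets a single
    chance to activate each inactive out-neighbour with probability p."""
    active, frontier = set(seeds), list(seeds)
    while frontier:
        nxt = []
        for u in frontier:
            for v in graph.get(u, []):
                if v not in active and rng.random() < p:
                    active.add(v)
                    nxt.append(v)
        frontier = nxt
    return active

graph = {0: [1, 2], 1: [3], 2: [3], 3: []}
rng = random.Random(0)
spread = independent_cascade(graph, [0], 1.0, rng)  # p=1: all reachable nodes
print(sorted(spread))  # [0, 1, 2, 3]
```

A bandit algorithm like the abstract's IMLinUCB would repeatedly choose seed sets, observe which edges fired (the semi-bandit feedback), and update its estimates of the unknown edge probabilities.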
visualize spatial transformation based perturbation different examples show technique produce realistic adversarial examples smooth image deformation. finally, visualize attention deep networks different types adversarial examples better understand examples interpreted.",4 "cooperative multi-agent planning: survey. cooperative multi-agent planning (map) relatively recent research field combines technologies, algorithms techniques developed artificial intelligence planning multi-agent systems communities. planning generally treated single-agent task, map generalizes concept considering multiple intelligent agents work cooperatively develop course action satisfies goals group. paper reviews relevant approaches map, putting focus solvers took part 2015 competition distributed multi-agent planning, classifies according key features relative performance.",4 "evaluation explore-exploit policies multi-result ranking systems. analyze problem using explore-exploit techniques improve precision multi-result ranking systems web search, query autocompletion news recommendation. adopting exploration policy directly online, without understanding impact production system, may unwanted consequences - system may sustain large losses, create user dissatisfaction, collect exploration data help improve ranking quality. offline framework thus necessary let us decide policy apply production environment ensure positive outcome. here, describe offline framework. using framework, study popular exploration policy - thompson sampling. show different ways implementing multi-result ranking systems, different semantic interpretation leading different results terms sustained click-through-rate (ctr) loss expected model improvement. particular, demonstrate thompson sampling act online learner optimizing ctr, cases lead interesting outcome: lift ctr exploration. 
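The explore-exploit abstract above studies Thompson sampling as an online learner optimizing CTR. A minimal Beta-Bernoulli sketch with assumed toy click rates (this is not the paper's offline evaluation framework):

```python
import random

def thompson_step(successes, failures, rng):
    """Pick the arm whose Beta(s+1, f+1) posterior sample is largest."""
    draws = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    return max(range(len(draws)), key=draws.__getitem__)

rng = random.Random(1)
true_ctr = [0.1, 0.5]          # hypothetical click rates of two rankings
s, f = [0, 0], [0, 0]
for _ in range(2000):
    arm = thompson_step(s, f, rng)
    if rng.random() < true_ctr[arm]:
        s[arm] += 1
    else:
        f[arm] += 1
print(s, f)  # the better arm accumulates far more pulls
```

Because exploration concentrates on the arm that is plausibly best, this sampler can raise CTR even while it explores, which is the "lift during exploration" outcome the abstract describes.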
observation important production systems suggests one get valuable exploration data improve ranking performance long run, time increase ctr exploration lasts.",4 "3d camouflaging object using rgb-d sensors. paper proposes new optical camouflage system uses rgb-d cameras, acquiring point cloud background scene, tracking observers eyes. system enables user conceal object located behind display surrounded 3d objects. considered tracked point observer eyes light source, system work estimating shadow shape display device falls objects background. system uses 3d observer eyes locations display corners predict shadow points nearest neighbors constructed point cloud background scene.",4 "pitman-yor diffusion trees. introduce pitman yor diffusion tree (pydt) hierarchical clustering, generalization dirichlet diffusion tree (neal, 2001) removes restriction binary branching structure. generative process described shown result exchangeable distribution data points. prove theoretical properties model present two inference methods: collapsed mcmc sampler allows us model uncertainty tree structures, computationally efficient greedy bayesian em search algorithm. algorithms use message passing tree structure. utility model algorithms demonstrated synthetic real world data, continuous binary.",19 "machine learning methods analyze arabidopsis thaliana plant root growth. one challenging problems biology classify plants based reaction genetic mutation. arabidopsis thaliana plant interesting, genetic structure similarities human beings. biologists classify type plant mutated mutated (wild) types. phenotypic analysis types time-consuming costly effort individuals. paper, propose modified feature extraction step using velocity acceleration root growth. second step, plant classification, employed different support vector machine (svm) kernels two hybrid systems neural networks. 
gated negative correlation learning (gncl) mixture negatively correlated experts (mnce) two ensemble methods based complementary feature classical classifiers; mixture expert (me) negative correlation learning (ncl). hybrid systems conserve advantages decrease effects disadvantages ncl me. experimental results show mnce gncl improve efficiency classical classifiers, however, svm kernels function better performance classifiers based neural network ensemble method. moreover, kernels consume less time obtain classification rate.",4 "continuous features discretization anomaly intrusion detectors generation. network security growing issue, evolution computer systems expansion attacks. biological systems inspiring scientists designs new adaptive solutions, genetic algorithms. paper, present approach uses genetic algorithm generate anomaly network intrusion detectors. paper, algorithm propose use discretization method continuous features selected intrusion detection, create homogeneity values, different data types. then, the intrusion detection system tested nsl-kdd data set using different distance methods. comparison held amongst results, shown end proposed approach good results, recommendations given future experiments.",4 "driven distraction: self-supervised distractor learning robust monocular visual odometry urban environments. present self-supervised approach ignoring ""distractors"" camera images purposes robustly estimating vehicle motion cluttered urban environments. leverage offline multi-session mapping approaches automatically generate per-pixel ephemerality mask depth map input image, use train deep convolutional network. run-time use predicted ephemerality depth input monocular visual odometry (vo) pipeline, using either sparse features dense photometric matching. approach yields metric-scale vo using single camera recover correct egomotion even 90% image obscured dynamic, independently moving objects.
evaluate robust vo methods 400km driving oxford robotcar dataset demonstrate reduced odometry drift significantly improved egomotion estimation presence large moving vehicles urban traffic.",4 "arabic keyphrase extraction using linguistic knowledge machine learning techniques. paper, supervised learning technique extracting keyphrases arabic documents presented. extractor supplied linguistic knowledge enhance efficiency instead relying statistical information term frequency distance. analysis, annotated arabic corpus used extract required lexical features document words. knowledge also includes syntactic rules based part speech tags allowed word sequences extract candidate keyphrases. work, abstract form arabic words used instead stem form represent candidate terms. abstract form hides inflections found arabic words. paper introduces new features keyphrases based linguistic knowledge, capture titles subtitles document. simple anova test used evaluate validity selected features. then, learning model built using lda - linear discriminant analysis - training documents. although, presented system trained using documents domain, experiments carried show significantly better performance existing arabic extractor systems, precision recall values reach double corresponding values systems especially lengthy non-scientific articles.",4 "iit bombay english-hindi parallel corpus. present iit bombay english-hindi parallel corpus. corpus compilation parallel corpora previously available public domain well new parallel corpora collected. corpus contains 1.49 million parallel segments, 694k segments previously available public domain. corpus pre-processed machine translation, report baseline phrase-based smt nmt translation results corpus. corpus used two editions shared tasks workshop asian language translation (2016 2017). corpus freely available non-commercial research.
best knowledge, largest publicly available english-hindi parallel corpus.",4 "adaptive seeding gaussian mixture models. present new initialization methods expectation-maximization algorithm multivariate gaussian mixture models. methods adaptions well-known $k$-means++ initialization gonzalez algorithm. thereby aim close gap simple random, e.g. uniform, complex methods, crucially depend right choice hyperparameters. extensive experiments indicate usefulness methods compared common techniques methods, e.g. apply original $k$-means++ gonzalez directly, respect artificial well real-world data sets.",4 "gaussian process domain experts model adaptation facial behavior analysis. present novel approach supervised domain adaptation based upon probabilistic framework gaussian processes (gps). specifically, introduce domain-specific gps local experts facial expression classification face images. adaptation classifier facilitated probabilistic fashion conditioning target expert multiple source experts. furthermore, contrast existing adaptation approaches, also learn target expert available target data solely. then, single confident classifier obtained combining predictions multiple experts based confidence. learning model efficient requires retraining/reweighting source classifiers. evaluate proposed approach two publicly available datasets multi-class (multipie) multi-label (disfa) facial expression classification. end, perform adaptation two contextual factors: 'where' (view) 'who' (subject). show experiments proposed approach consistently outperforms source target classifiers, using 30 target examples. also outperforms state-of-the-art approaches supervised domain adaptation.",19 "revisiting kernelized locality-sensitive hashing improved large-scale image retrieval. 
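The adaptive-seeding abstract above adapts k-means++ to initialize Gaussian mixture models. A 1-D sketch of the k-means++ seeding step itself; the data points and k are assumptions:

```python
import random

def kmeanspp_seeds(points, k, rng):
    """k-means++ seeding: the first centre is uniform; each later centre is
    drawn with probability proportional to its squared distance to the
    nearest centre chosen so far."""
    centres = [rng.choice(points)]
    while len(centres) < k:
        d2 = [min((p - c) ** 2 for c in centres) for p in points]
        r, acc = rng.random() * sum(d2), 0.0
        for p, w in zip(points, d2):   # weighted draw by squared distance
            acc += w
            if acc >= r:
                centres.append(p)
                break
    return centres

points = [0.0, 0.1, 0.2, 10.0, 10.1, 10.2]
rng = random.Random(0)
centres = kmeanspp_seeds(points, 2, rng)
print(sorted(centres))  # the two centres very likely land in different clusters
```

For a GMM, each seed would become a component mean, with covariances and weights then refined by EM, which is the gap between cheap random seeding and hyperparameter-heavy methods the abstract aims to close.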
present simple powerful reinterpretation kernelized locality-sensitive hashing (klsh), general popular method developed vision community performing approximate nearest-neighbor searches arbitrary reproducing kernel hilbert space (rkhs). new perspective based viewing steps klsh algorithm appropriately projected space, several key theoretical practical benefits. first, eliminates problematic conceptual difficulties present existing motivation klsh. second, yields first formal retrieval performance bounds klsh. third, analysis reveals two techniques boosting empirical performance klsh. evaluate extensions several large-scale benchmark image retrieval data sets, show analysis leads improved recall performance least 12%, sometimes much higher, standard klsh method.",4 "question answering natural language understanding system based object-oriented semantics. algorithms question answering computer system oriented input logical processing text information presented. knowledge domain consideration social behavior person. database system includes internal representation natural language sentences supplemental information. answer {\it yes} {\it no} formed general question. special question containing interrogative word group interrogative words permits find subject, object, place, time, cause, purpose way action event. answer generation based identification algorithms persons, organizations, machines, things, places, times. proposed algorithms question answering realized information systems closely connected text processing (criminology, operation business, medicine, document systems).",4 "functional decision theory: new theory instrumental rationality. paper describes motivates new decision theory known functional decision theory (fdt), distinct causal decision theory evidential decision theory. 
functional decision theorists hold normative principle action treat one's decision output fixed mathematical function answers question, ""which output function would yield best outcome?"" adhering principle delivers number benefits, including ability maximize wealth array traditional decision-theoretic game-theoretic problems cdt edt perform poorly. using one simple coherent decision rule, functional decision theorists (for example) achieve utility cdt newcomb's problem, utility edt smoking lesion problem, utility parfit's hitchhiker problem. paper, define fdt, explore prescriptions number different decision problems, compare cdt edt, give philosophical justifications fdt normative theory decision-making.",4 "network intrusions detection system based quantum bio inspired algorithm. network intrusion detection systems (nidss) role identifying malicious activities monitoring behavior networks. due currently high volume networks trafic addition increased number attacks dynamic properties, nidss challenge improving classification performance. bio-inspired optimization algorithms (bios) used automatically extract discrimination rules normal abnormal behavior improve classification accuracy detection ability nids. quantum vaccined immune clonal algorithm estimation distribution algorithm (qvica-with eda) proposed paper build new nids. proposed algorithm used classification algorithm new nids trained tested using kdd data set. also, new nids compared another detection system based particle swarm optimization (pso). results shows ability proposed algorithm achieving high intrusions classification accuracy highest obtained accuracy 94.8 %.",4 "simple, efficient, neural algorithms sparse coding. sparse coding basic task many fields including signal processing, neuroscience machine learning goal learn basis enables sparse representation given set data, one exists. standard formulation non-convex optimization problem solved practice heuristics based alternating minimization. 
recent work resulted several algorithms sparse coding provable guarantees, somewhat surprisingly outperformed simple alternating minimization heuristics. give general framework understanding alternating minimization leverage analyze existing heuristics design new ones also provable guarantees. algorithms seem implementable simple neural architectures, original motivation olshausen field (1997a) introducing sparse coding. also give first efficient algorithm sparse coding works almost information theoretic limit sparse recovery incoherent dictionaries. previous algorithms approached surpassed limit run time exponential natural parameter. finally, algorithms improve upon sample complexity existing approaches. believe analysis framework applications settings simple iterative algorithms used.",4 "computational model affects. article provides simple logical structure, affective concepts (i.e. concepts related emotions feelings) defined. set affects defined similar set emotions covered occ model (ortony a., collins a., clore g. l.: cognitive structure emotions. cambridge university press, 1988), model presented article fully computationally defined.",4 "noisy power method: meta algorithm applications. provide new robust convergence analysis well-known power method computing dominant singular vectors matrix call noisy power method. result characterizes convergence behavior algorithm significant amount noise introduced matrix-vector multiplication. noisy power method seen meta-algorithm recently found number important applications broad range machine learning problems including alternating minimization matrix completion, streaming principal component analysis (pca), privacy-preserving spectral analysis. general analysis subsumes several existing ad-hoc convergence bounds resolves number open problems multiple applications including streaming pca privacy-preserving singular vector computation.",4 "approximation guarantees greedy low rank optimization.
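The noisy power method abstract above analyzes power iteration when noise enters the matrix-vector product. A noise-free minimal sketch; the matrix, iteration count, and tolerance are assumptions, and the noisy variant would perturb the product where indicated:

```python
def power_method(matvec, dim, iters=200):
    """Dominant eigenvector of a symmetric operator via power iteration."""
    v = [1.0] * dim
    for _ in range(iters):
        w = matvec(v)                      # the noisy variant adds noise here
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

A = [[2.0, 0.0], [0.0, 1.0]]               # dominant eigenvector: e_1
mv = lambda v: [sum(a * x for a, x in zip(row, v)) for row in A]
v = power_method(mv, 2)
print([round(x, 6) for x in v])  # [1.0, 0.0]
```

The abstract's point is that convergence to the dominant subspace survives per-step noise of bounded size, which is what makes the same loop usable for streaming and privacy-preserving PCA.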
provide new approximation guarantees greedy low rank matrix estimation standard assumptions restricted strong convexity smoothness. novel analysis also uncovers previously unknown connections low rank estimation combinatorial optimization, much bounds reminiscent corresponding approximation bounds submodular maximization. additionally, also provide statistical recovery guarantees. finally, present empirical comparison greedy estimation established baselines two important real-world problems.",19 "computational models: bottom-up top-down aspects. computational models visual attention become popular past decade, believe primarily two reasons: first, models make testable predictions explored experimentalists well theoreticians, second, models practical technological applications interest applied science engineering communities. chapter, take critical look recent attention modeling efforts. focus {\em computational models attention} defined tsotsos \& rothenstein \shortcite{tsotsos_rothenstein11}: models process visual stimulus (typically, image video clip), possibly also given task definition, make predictions compared human animal behavioral physiological responses elicited stimulus task. thus, place less emphasis abstract models, phenomenological models, purely data-driven fitting extrapolation models, models specifically designed single task restricted class stimuli. theoretical models, refer reader number previous reviews address attention theories models generally \cite{itti_koch01nrn,paletta_etal05,frintrop_etal10,rothenstein_tsotsos08,gottlieb_balan10,toet11,borji_itti12pami}.",4 "bridging neural machine translation bilingual dictionaries. neural machine translation (nmt) become new state-of-the-art several language pairs. however, remains challenging problem integrate nmt bilingual dictionary mainly contains words rarely never seen bilingual training data. paper, propose two methods bridge nmt bilingual dictionaries. 
core idea behind design novel models transform bilingual dictionaries adequate sentence pairs, nmt distil latent bilingual mappings ample repetitive phenomena. one method leverages mixed word/character model attempts synthesizing parallel sentences guaranteeing massive occurrence translation lexicon. extensive experiments demonstrate proposed methods remarkably improve translation quality, rare words test sentences obtain correct translations covered dictionary.",4 "generic deep networks wavelet scattering. introduce two-layer wavelet scattering network, object classification. scattering transform computes spatial wavelet transform first layer new joint wavelet transform along spatial, angular scale variables second layer. numerical experiments demonstrate two layer convolution network, involves learning max pooling, performs efficiently complex image data sets caltech, structural objects variability clutter. opens possibility simplify deep neural network learning initializing first layers wavelet filters.",4 "informed sampler: discriminative approach bayesian inference generative computer vision models. computer vision hard large variability lighting, shape, texture; addition image signal non-additive due occlusion. generative models promised account variability accurately modelling image formation process function latent variables prior beliefs. bayesian posterior inference could then, principle, explain observation. intuitively appealing, generative models computer vision largely failed deliver promise due difficulty posterior inference. result community favoured efficient discriminative approaches. still believe usefulness generative models computer vision, argue need leverage existing discriminative even heuristic computer vision methods. implement idea principled way ""informed sampler"" careful experiments demonstrate challenging generative models contain renderer programs components. 
concentrate problem inverting existing graphics rendering engine, approach understood ""inverse graphics"". informed sampler, using simple discriminative proposals based existing computer vision technology, achieves significant improvements inference.",4 "optimal auctions deep learning. designing auction maximizes expected revenue intricate task. indeed, today--despite major efforts impressive progress past years--only single-item case fully understood. work, initiate exploration use tools deep learning topic. design objective revenue optimal, dominant-strategy incentive compatible auctions. show multi-layer neural networks learn almost-optimal auctions settings analytical solutions, myerson's auction single item, manelli vincent's mechanism single bidder additive preferences two items, yao's auction two additive bidders binary support distributions multiple items, even prior knowledge form optimal auctions encoded network feedback training revenue regret. show characterization results, even rather implicit ones rochet's characterization induced utilities gradients, leveraged obtain precise fits optimal design. conclude demonstrating potential deep learning deriving optimal auctions high revenue poorly understood problems.",4 "multi-step-ahead time series prediction using multiple-output support vector regression. accurate time series prediction long future horizons challenging great interest practitioners academics. well-known intelligent algorithm, standard formulation support vector regression (svr) could taken multi-step-ahead time series prediction, relying either iterated strategy direct strategy. study proposes novel multiple-step-ahead time series prediction approach employs multiple-output support vector regression (m-svr) multiple-input multiple-output (mimo) prediction strategy. 
addition, rank three leading prediction strategies svr comparatively examined, providing practical implications selection prediction strategy multi-step-ahead forecasting taking svr modeling technique. proposed approach validated simulated real datasets. quantitative comprehensive assessments performed basis prediction accuracy computational cost. results indicate that: 1) m-svr using mimo strategy achieves best accurate forecasts accredited computational load, 2) standard svr using direct strategy achieves second best accurate forecasts, expensive computational cost, 3) standard svr using iterated strategy worst terms prediction accuracy, least computational cost.",4 "role zero synapses unsupervised feature learning. synapses real neural circuits take discrete values, including zero (silent potential) synapses. computational role zero synapses unsupervised feature learning unlabeled noisy data still unclear, thus important understand sparseness synaptic activity shaped learning relationship receptive field formation. here, formulate kind sparse feature learning statistical mechanics approach. find learning decreases fraction zero synapses, fraction decreases rapidly around critical data size, intrinsically structured receptive field starts develop. increasing data size refines receptive field, small fraction zero synapses remain act contour detectors. phenomenon discovered learning handwritten digits dataset, also learning retinal neural activity measured natural-movie-stimuli experiment.",16 "causal network inference via group sparse regularization. paper addresses problem inferring sparse causal networks modeled multivariate auto-regressive (mar) processes. conditions derived group lasso (glasso) procedure consistently estimates sparse network structure. 
key condition involves ""false connection score."" particular, show consistent recovery possible even number observations network far less number parameters describing network, provided false connection score less one. false connection score also demonstrated useful metric recovery non-asymptotic regimes. conditions suggest modified glasso procedure tends improve false connection score reduce chances reversing direction causal influence. computational experiments real network based electrocorticogram (ecog) simulation study demonstrate effectiveness approach.",19 "infochemical core. vocalizations less often gestures object linguistic research decades. however, development general theory communication human language particular case requires clear understanding organization communication means. infochemicals chemical compounds carry information employed small organisms cannot emit acoustic signals optimal frequency achieve successful communication. distribution infochemicals across species investigated ranked degree number species associated (because produce sensitive it). quality fit different functions dependency degree rank evaluated penalty number parameters function. surprisingly, double zipf (a zipf distribution two regimes different exponent each) model yielding best fit although function largest number parameters. suggests world wide repertoire infochemicals contains chemical nucleus shared many species reminiscent core vocabularies found human language dictionaries large corpora.",16 "deep reconstruction-classification networks unsupervised domain adaptation. paper, propose novel unsupervised domain adaptation algorithm based deep learning visual object recognition. 
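The group-lasso abstract above zeroes whole groups of MAR coefficients to prune network connections. The block soft-thresholding (proximal) step behind that behavior can be sketched as follows; the group layout and the threshold lam are assumptions:

```python
def group_soft_threshold(groups, lam):
    """Shrink each coefficient group toward zero; kill a group entirely
    when its Euclidean norm falls below lam (this is what removes whole
    connections rather than individual coefficients)."""
    out = []
    for g in groups:
        norm = sum(x * x for x in g) ** 0.5
        scale = max(0.0, 1.0 - lam / norm) if norm > 0 else 0.0
        out.append([scale * x for x in g])
    return out

shrunk = group_soft_threshold([[3.0, 4.0], [0.1, 0.1]], lam=1.0)
print([[round(x, 6) for x in g] for g in shrunk])  # [[2.4, 3.2], [0.0, 0.0]]
```

Grouping all lags of one candidate edge together is what lets the procedure decide presence or absence of a causal connection as a unit.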
specifically, design new model called deep reconstruction-classification network (drcn), jointly learns shared encoding representation two tasks: i) supervised classification labeled source data, ii) unsupervised reconstruction unlabeled target data. in way, learnt representation preserves discriminability, also encodes useful information target domain. new drcn model optimized using backpropagation similarly standard neural networks. evaluate performance drcn series cross-domain object recognition tasks, drcn provides considerable improvement (up ~8% accuracy) prior state-of-the-art algorithms. interestingly, also observe reconstruction pipeline drcn transforms images source domain images whose appearance resembles target dataset. suggests drcn's performance due constructing single composite representation encodes information structure target images classification source images. finally, provide formal analysis justify algorithm's objective domain adaptation context.",4 "continuous time dynamic topic models. paper, develop continuous time dynamic topic model (cdtm). cdtm dynamic topic model uses brownian motion model latent topics sequential collection documents, ""topic"" pattern word use expect evolve course collection. derive efficient variational approximate inference algorithm takes advantage sparsity observations text, property lets us easily handle many time points. contrast cdtm, original discrete-time dynamic topic model (ddtm) requires time discretized. moreover, complexity variational inference ddtm grows quickly time granularity increases, drawback limits fine-grained discretization. demonstrate cdtm two news corpora, reporting predictive perplexity novel task time stamp prediction.",4 speeding-up decision making learning agent using ion trap quantum processor. report proof-of-principle experimental demonstration quantum speed-up learning agents utilizing small-scale quantum information processor based radiofrequency-driven trapped ions.
decision-making process quantum learning agent within projective simulation paradigm machine learning implemented system two qubits. latter realized using hyperfine states two frequency-addressed atomic ions exposed static magnetic field gradient. show deliberation time quantum learning agent quadratically improved respect comparable classical learning agents. performance quantum-enhanced learning agent highlights potential scalable quantum processors taking advantage machine learning.,18 "material classification wild: synthesized training data generalise better real-world training data?. question dominant role real-world training images field material classification investigating whether synthesized data generalise effectively real-world data. experimental results three challenging real-world material databases show best performing pre-trained convolutional neural network (cnn) architectures achieve 91.03% mean average precision classifying materials cross-dataset scenarios. demonstrate synthesized data achieve improvement mean average precision used training data conjunction pre-trained cnn architectures, spans ~ 5% ~ 19% across three widely used material databases real-world images.",4 "linear algorithm digital euclidean connected skeleton. skeleton essential shape characteristic providing compact representation studied shape. computation image grid raises many issues. due effects discretization, required properties skeleton - thinness, homotopy shape, reversibility, connectivity - may become incompatible. however, regards practical use, choice specific skeletonization algorithm depends application. allows classify desired properties order importance, tend towards critical ones. goal make skeleton dedicated shape matching recognition. so, discrete skeleton thin - represented graph -, robust noise, reversible - initial shape fully reconstructed - homotopic shape. 
We propose a linear-time skeletonization algorithm based on the squared Euclidean distance map, from which we extract maximal balls and ridges. After a thinning and pruning process, we obtain the skeleton. The proposed method is finally compared to fairly recent methods.",4 "outlying property detection with numerical attributes. The outlying property detection problem is the problem of discovering the properties distinguishing a given object, known in advance to be an outlier in a database, from the other database objects. In this paper, we analyze this problem within a context where numerical attributes are taken into account, which represents a relevant case left open in the literature. We introduce a measure to quantify the degree of outlierness of an object, which is associated with the relative likelihood of its value compared to the relative likelihood of the other objects in the database. As a major contribution, we present an efficient algorithm to compute the outlierness relative to significant subsets of the data. The latter subsets are characterized in a ""rule-based"" fashion, and hence form the basis of the underlying explanation of the outlierness.",4 "relax and localize: from value to algorithms. We show a principled way of deriving online learning algorithms from a minimax analysis. Various upper bounds on the minimax value, previously thought to be non-constructive, are shown to yield algorithms. This allows us to seamlessly recover known methods and to derive new ones. Our framework also captures such ""unorthodox"" methods as follow the perturbed leader and the R^2 forecaster. We emphasize that understanding the inherent complexity of the learning problem leads to the development of algorithms. We define local sequential Rademacher complexities and associated algorithms that allow us to obtain faster rates in online learning, similarly to statistical learning theory. Based on these localized complexities, we build a general adaptive method that can take advantage of the suboptimality of the observed sequence. We present a number of new algorithms, including a family of randomized methods that use the idea of a ""random playout"". Several new versions of the follow-the-perturbed-leader algorithms are presented, as well as methods based on Littlestone's dimension, efficient methods for matrix completion with the trace norm, and algorithms for the problems of transductive learning and prediction with static experts.",4 "traffic sign classification using deep inception based convolutional networks.
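The ""follow the perturbed leader"" method named in the relax-and-localize abstract above admits a very short sketch for prediction with expert advice; the exponential perturbation and the loss-matrix layout here are assumptions for illustration, not the paper's construction:

```python
import random

def follow_the_perturbed_leader(loss_rows, eta=1.0, seed=0):
    """Follow-the-Perturbed-Leader for prediction with expert advice:
    each round, play the expert whose cumulative loss minus a fresh
    exponential perturbation is smallest. loss_rows[t][i] is the loss
    of expert i in round t; returns the algorithm's total loss."""
    rng = random.Random(seed)
    n = len(loss_rows[0])
    cum = [0.0] * n
    total = 0.0
    for row in loss_rows:
        # Perturb the cumulative losses, then follow the (perturbed) leader.
        perturbed = [cum[i] - rng.expovariate(eta) for i in range(n)]
        leader = min(range(n), key=lambda i: perturbed[i])
        total += row[leader]
        for i in range(n):
            cum[i] += row[i]
    return total
```

Against a fixed best expert, the fresh perturbation each round is what keeps the regret small in expectation.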
In this work, we propose a novel deep network for traffic sign classification that achieves outstanding performance on GTSRB, surpassing previous methods. Our deep network consists of spatial transformer layers and a modified version of the inception module specifically designed for capturing local and global features together. This adoption of features allows the network to classify intraclass samples precisely, even under deformations. Use of the spatial transformer layer makes the network robust to deformations such as translation, rotation and scaling of input images. Unlike existing approaches that are developed with hand-crafted features, multiple deep networks with huge parameters, and heavy data augmentation, our method addresses the concern of exploding parameters and augmentations. We achieved state-of-the-art performance of 99.81\% on the GTSRB dataset.",4 "detecting blackholes and volcanoes in directed networks. In this paper, we formulate a novel problem of finding blackhole and volcano patterns in a large directed graph. Specifically, a blackhole pattern is a group made of a set of nodes in such a way that there are only inlinks to this group from the rest of the nodes in the graph. In contrast, a volcano pattern is a group which only has outlinks to the rest of the nodes in the graph. Both patterns can be observed in the real world. For instance, in a trading network, a blackhole pattern may represent a group of traders who are manipulating the market. In this paper, we first prove that the blackhole mining problem is a dual problem of finding volcanoes. Therefore, we focus on finding the blackhole patterns. Along this line, we design two pruning schemes to guide the blackhole finding process. In the first pruning scheme, we strategically prune the search space based on a set of pattern-size-independent pruning rules and develop the iBlackhole algorithm. The second pruning scheme follows a divide-and-conquer strategy to exploit the pruning results of the first pruning scheme. Indeed, the target directed graphs can be divided into several disconnected subgraphs by the first pruning scheme, and thus the blackhole finding can be conducted on each disconnected subgraph rather than on the large graph. Based on these two pruning schemes, we also develop the iBlackhole-DC algorithm.
Finally, experimental results on real-world data show that the iBlackhole-DC algorithm can be several orders of magnitude faster than the iBlackhole algorithm, and both have a huge computational advantage over a brute-force method.",4 "parallel chromatic mcmc with spatial partitioning. We introduce a novel approach for parallelizing MCMC inference in models with spatially determined conditional independence relationships, for which existing techniques exploiting graphical model structure are not applicable. Our approach is motivated by a model of seismic events and signals, in which events detected in distant regions are approximately independent given those in intermediate regions. We perform parallel inference by coloring a factor graph defined over regions of latent space, rather than over individual model variables. Evaluating on a model of seismic event detection, we achieve significant speedups over serial MCMC with no degradation in inference quality.",19 "gradient estimation using stochastic computation graphs. In a variety of problems originating in supervised, unsupervised, and reinforcement learning, the loss function is defined by an expectation over a collection of random variables, which might be part of a probabilistic model or the external world. Estimating the gradient of this loss function, using samples, lies at the core of gradient-based learning algorithms for these problems. We introduce the formalism of stochastic computation graphs---directed acyclic graphs that include both deterministic functions and conditional probability distributions---and describe how to easily and automatically derive an unbiased estimator of the loss function's gradient. The resulting algorithm for computing the gradient estimator is a simple modification of the standard backpropagation algorithm. The generic scheme we propose unifies estimators derived in a variety of prior work, along with variance-reduction techniques therein. It could assist researchers in developing intricate models involving a combination of stochastic and deterministic operations, enabling, for example, attention, memory, and control actions.",4 "causal discovery in a binary exclusive-or skew acyclic model: bexsam. Discovering causal relations among observed variables in a given data set is a major objective of studies in statistics and artificial intelligence.
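As a point of reference for the blackhole definition above (a node group receiving only inlinks from the rest of the graph), the brute-force baseline that iBlackhole-DC is compared against can be sketched as a plain enumeration over small groups; the pruning rules themselves are not reproduced here:

```python
from itertools import combinations

def find_blackholes(nodes, edges, max_size=3):
    """Brute-force reference for the blackhole pattern: a group G with
    no edge leaving G (only inlinks into G are allowed). Enumerates all
    groups up to max_size; the iBlackhole algorithms prune this
    exponential search space instead."""
    out = {u: set() for u in nodes}
    for u, v in edges:
        out[u].add(v)
    holes = []
    for k in range(1, max_size + 1):
        for group in combinations(nodes, k):
            g = set(group)
            if all(out[u] <= g for u in g):   # every out-edge stays inside
                holes.append(g)
    return holes
```

On the four-node graph with edges 1→2, 3→2, 2→4, 4→2, the groups {2,4}, {1,2,4} and {2,3,4} are blackholes: no edge escapes them.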
Recently, techniques to discover a unique causal model have been explored based on the non-Gaussianity of the observed data distribution. However, they are limited to continuous data. In this paper, we present a novel causal model for binary data and propose an efficient new approach to deriving the unique causal model governing a given binary data set under skew distributions of the external binary noises. Experimental evaluation shows excellent performance on both artificial and real world data sets.",19 "total-order and partial-order planning: a comparative analysis. For many years, the intuitions underlying partial-order planning were largely taken for granted. Only in the past few years has there been renewed interest in the fundamental principles underlying this paradigm. In this paper, we present a rigorous comparative analysis of partial-order and total-order planning by focusing on two specific planners that can be directly compared. We show that there are some subtle assumptions that underly the wide-spread intuitions regarding the supposed efficiency of partial-order planning. For instance, the superiority of partial-order planning can depend critically upon the search strategy and the structure of the search space. Understanding these underlying assumptions is crucial for constructing efficient planners.",4 "guaranteed clustering and biclustering via semidefinite programming. Identifying clusters of similar objects in data plays a significant role in a wide range of applications. As a model problem for clustering, we consider the densest k-disjoint-clique problem, whose goal is to identify the collection of k disjoint cliques of a given weighted complete graph maximizing the sum of the densities of the complete subgraphs induced by these cliques. In this paper, we establish conditions ensuring exact recovery of the densest k cliques of a given graph from the optimal solution of a particular semidefinite program. In particular, this semidefinite relaxation is exact for input graphs corresponding to data consisting of k large, distinct clusters and a smaller number of outliers. This approach also yields a semidefinite relaxation for the biclustering problem with similar recovery guarantees. Given a set of objects and a set of features exhibited by these objects, biclustering seeks to simultaneously group the objects and features according to their expression levels.
This problem may be posed as partitioning the nodes of a weighted bipartite complete graph such that the sum of the densities of the resulting bipartite complete subgraphs is maximized. As in our analysis of the densest k-disjoint-clique problem, we show that the correct partition of the objects and features can be recovered from the optimal solution of a semidefinite program in the case that the given data consists of several disjoint sets of objects exhibiting similar features. Empirical evidence from numerical experiments supporting these theoretical guarantees is also provided.",12 "we used neural networks to detect clickbaits: you won't believe what happened next!. Online content publishers often use catchy headlines for their articles in order to attract users to their websites. These headlines, popularly known as clickbaits, exploit a user's curiosity gap and lure them to click on links that often disappoint them. Existing methods for automatically detecting clickbaits rely on heavy feature engineering and domain knowledge. Here, we introduce a neural network architecture based on recurrent neural networks for detecting clickbaits. Our model relies on distributed word representations learned from large unannotated corpora, and character embeddings learned via convolutional neural networks. Experimental results on a dataset of news headlines show that our model outperforms existing techniques for clickbait detection with an accuracy of 0.98, an F1-score of 0.98 and a ROC-AUC of 0.99.",4 "learning action models: a qualitative approach. In dynamic epistemic logic, actions are described using action models. In this paper we introduce a framework for studying the learnability of action models from observations. We present first results concerning propositional action models. First we check two basic learnability criteria: finite identifiability (conclusively inferring the appropriate action model in finite time) and identifiability in the limit (inconclusive convergence to the right action model). We show that deterministic actions are finitely identifiable, while non-deterministic actions require more learning power - they are identifiable in the limit. We then move on to a particular learning method, which proceeds via restriction of a space of events within a learning-specific action model. This way of learning closely resembles the well-known update method from dynamic epistemic logic.
We introduce several different learning methods suited for finite identifiability of particular types of deterministic actions.",4 "learning to decode linear codes using deep learning. A novel deep learning method for improving the belief propagation algorithm is proposed. The method generalizes the standard belief propagation algorithm by assigning weights to the edges of the Tanner graph. These weights are then trained using deep learning techniques. A well-known property of the belief propagation algorithm is the independence of its performance from the transmitted codeword. A crucial property of our new method is that the decoder preserves this property. Furthermore, this property allows us to learn from a single codeword instead of an exponential number of codewords. Improvements over the belief propagation algorithm are demonstrated for various high density parity check codes.",4 "comparing deep neural networks against humans: object recognition when the signal gets weaker. Human visual object recognition is typically rapid and seemingly effortless, as well as largely independent of viewpoint and object orientation. Until very recently, animate visual systems were the only ones capable of this remarkable computational feat. This has changed with the rise of a class of computer vision algorithms called deep neural networks (DNNs) that achieve human-level classification performance on object recognition tasks. Furthermore, a growing number of studies report similarities in the way DNNs and the human visual system process objects, suggesting that current DNNs may be good models of human visual object recognition. Yet there clearly exist important architectural and processing differences between state-of-the-art DNNs and the primate visual system. The potential behavioural consequences of these differences are not well understood. We aim to address this issue by comparing human and DNN generalisation abilities towards image degradations. We find that the human visual system is more robust to image manipulations like contrast reduction, additive noise or novel eidolon-distortions. In addition, we find progressively diverging classification error-patterns between man and DNNs when the signal gets weaker, indicating that there may still be marked differences in the way humans and current DNNs perform visual object recognition.
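For context on the decoding abstract above: the kind of classical hard-decision decoding that the weighted belief-propagation decoder improves on can be illustrated with syndrome decoding of the small (7,4) Hamming code. This is a baseline sketch, not the neural decoder and not the high-density parity-check codes of the paper:

```python
def hamming74_correct(bits):
    """Classical syndrome decoding for the (7,4) Hamming code with bit
    positions 1..7. The parity-check column of position j is the binary
    expansion of j, so the syndrome (XOR of the indices of set bits)
    is directly the index of a single flipped bit."""
    syndrome = 0
    for j, b in enumerate(bits, start=1):
        if b:
            syndrome ^= j
    if syndrome:                       # nonzero syndrome: single-bit error
        bits = bits[:]                 # don't mutate the caller's list
        bits[syndrome - 1] ^= 1
    return bits
```

Valid codewords are exactly the bit patterns whose set positions XOR to zero; any single bit flip is corrected.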
We envision that our findings, as well as our carefully measured and freely available behavioural datasets, provide a new useful benchmark for the computer vision community to improve the robustness of DNNs, and a motivation for neuroscientists to search for mechanisms in the brain that could facilitate this robustness.",4 "variational recurrent neural machine translation. Partially inspired by successful applications of variational recurrent neural networks, we propose a novel variational recurrent neural machine translation (VRNMT) model in this paper. Different from the variational NMT, VRNMT introduces a series of latent random variables to model the translation procedure of a sentence in a generative way, instead of a single latent variable. Specifically, the latent random variables are included in the hidden states of the NMT decoder with elements from the variational autoencoder. In this way, these variables are recurrently generated, which enables them to capture strong and complex dependencies among the output translations at different timesteps. In order to deal with the challenges of performing efficient posterior inference and large-scale training during the incorporation of latent variables, we build a neural posterior approximator and equip it with a reparameterization technique to estimate the variational lower bound. Experiments on Chinese-English and English-German translation tasks demonstrate that the proposed model achieves significant improvements over both the conventional and variational NMT models.",4 "stochastic generative hashing. Learning-based binary hashing has become a powerful paradigm for fast search and retrieval in massive databases. However, due to the requirement of discrete outputs for the hash functions, learning such functions is known to be very challenging. In addition, the objective functions adopted by existing hashing techniques are mostly chosen heuristically. In this paper, we propose a novel generative approach to learn hash functions through the minimum description length principle, such that the learned hash codes maximally compress the dataset and can also be used to regenerate the inputs. We also develop an efficient learning algorithm based on the stochastic distributional gradient, which avoids the notorious difficulty caused by binary output constraints, to jointly optimize the parameters of the hash function and the associated generative model.
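The reparameterization technique that VRNMT uses to estimate its variational lower bound is standard and compact enough to state directly. A sketch with a scalar Gaussian latent (the model's actual latents live in the decoder's recurrent hidden states; this is only the generic trick):

```python
import math
import random

def reparam_sample(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, 1),
    so that gradients can flow through mu and log_var even though z is
    a random sample."""
    eps = rng.gauss(0.0, 1.0)
    return mu + math.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL(N(mu, sigma^2) || N(0, 1)), the regularization
    term that appears in the variational lower bound."""
    return 0.5 * (math.exp(log_var) + mu * mu - 1.0 - log_var)
```

The KL term is zero exactly when the approximate posterior matches the standard normal prior.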
Extensive experiments on a variety of large-scale datasets show that the proposed method achieves better retrieval results than the existing state-of-the-art methods.",4 "using mechanical turk to build machine translation evaluation sets. Building machine translation (MT) test sets is a relatively expensive task. As MT becomes increasingly desired for more and more language pairs and domains, it becomes necessary to build test sets for each case. In this paper, we investigate using Amazon's Mechanical Turk (MTurk) to make MT test sets cheaply. We find that MTurk can be used to make test sets much more cheaply than professionally-produced test sets. More importantly, in experiments with multiple MT systems, we find that the MTurk-produced test sets yield essentially the same conclusions regarding system performance as the professionally-produced test sets.",4 "marginal and simultaneous predictive classification using stratified graphical models. An inductive probabilistic classification rule must generally obey the principles of Bayesian predictive inference, such that all observed and unobserved stochastic quantities are jointly modeled and the parameter uncertainty is fully acknowledged through the posterior predictive distribution. Several such rules have recently been considered, and their asymptotic behavior has been characterized under the assumption that the observed features or variables used for building a classifier are conditionally independent given a simultaneous labeling of both the training samples and those of unknown origin. Here we extend the theoretical results to predictive classifiers acknowledging feature dependencies, either through graphical models or sparser alternatives defined as stratified graphical models. We also show through experimentation with both synthetic and real data that predictive classifiers based on stratified graphical models have consistently the best accuracy compared with predictive classifiers based on either conditionally independent features or ordinary graphical models.",19 "end-to-end weakly-supervised semantic alignment. We tackle the task of semantic alignment, where the goal is to compute dense semantic correspondence aligning two images depicting objects of the same category. This is a challenging task due to large intra-class variation, changes in viewpoint and background clutter. We present the following three principal contributions.
First, we develop a convolutional neural network architecture for semantic alignment that is trainable in an end-to-end manner from weak image-level supervision in the form of matching image pairs. The outcome is that parameters are learnt from the rich appearance variation present in different but semantically related images, without the need for tedious manual annotation of correspondences at training time. Second, the main component of this architecture is a differentiable soft inlier scoring module, inspired by the RANSAC inlier scoring procedure, that computes the quality of the alignment based on only geometrically consistent correspondences, thereby reducing the effect of background clutter. Third, we demonstrate that the proposed approach achieves state-of-the-art performance on multiple standard benchmarks for semantic alignment.",4 "the shattered gradients problem: if resnets are the answer, then what is the question?. A long-standing obstacle to progress in deep learning is the problem of vanishing and exploding gradients. The problem has largely been overcome through the introduction of carefully constructed initializations and batch normalization. Nevertheless, architectures incorporating skip-connections such as resnets perform much better than standard feedforward architectures, despite well-chosen initialization and batch normalization. In this paper, we identify the shattered gradients problem. Specifically, we show that the correlation between gradients in standard feedforward networks decays exponentially with depth, resulting in gradients that resemble white noise. In contrast, the gradients in architectures with skip-connections are far more resistant to shattering, decaying sublinearly. Detailed empirical evidence is presented in support of the analysis, on both fully-connected networks and convnets. Finally, we present a new ""looks linear"" (LL) initialization that prevents shattering. Preliminary experiments show that the new initialization allows training very deep networks without the addition of skip-connections.",4 "proceedings of the fifth workshop on developments in computational models--computational models from nature. The special theme of DCM 2009, co-located with ICALP 2009, concerned computational models from nature, with a particular emphasis on computational models derived from physics and biology.
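The ""looks linear"" initialization from the shattered-gradients abstract can be verified in a few lines: pairing every weight row with its negation makes a ReLU block compute an exactly linear map at initialization, so gradients cannot shatter before training starts. A minimal sketch of that identity:

```python
def relu(v):
    return [max(0.0, x) for x in v]

def ll_block(W, x):
    """The 'looks linear' idea: duplicate the weight matrix W with its
    negation -W before the ReLU. Since relu(t) - relu(-t) = t for every
    real t, the block below returns exactly W @ x at initialization,
    i.e. it behaves as a purely linear map despite the nonlinearity."""
    pos = relu([sum(wi * xi for wi, xi in zip(row, x)) for row in W])
    neg = relu([-sum(wi * xi for wi, xi in zip(row, x)) for row in W])
    return [p - n for p, n in zip(pos, neg)]
```

During training the paired weights drift apart and the block becomes genuinely nonlinear, but the starting point is linear.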
The intention was to bring together different approaches - in a community with the strong foundational background proffered by the ICALP attendees - to create inspirational cross-boundary exchanges and to lead to innovative research. Specifically, DCM 2009 sought contributions in quantum computation and information, probabilistic models, and chemical, biological and bio-inspired ones, including spatial models, growth models and models of self-assembly. Contributions putting to the test logical and algorithmic aspects of computing (e.g., continuous computing with dynamical systems, or solid state computing models) were also much welcomed.",4 "diachronic word embeddings reveal statistical laws of semantic change. Understanding how words change their meanings over time is key to models of language and cultural evolution, but historical data on meaning is scarce, making theories hard to develop and test. Word embeddings show promise as a diachronic tool, but have not been carefully evaluated. We develop a robust methodology for quantifying semantic change by evaluating word embeddings (PPMI, SVD, word2vec) against known historical changes. We then use this methodology to reveal statistical laws of semantic evolution. Using six historical corpora spanning four languages and two centuries, we propose two quantitative laws of semantic change: (i) the law of conformity---the rate of semantic change scales with an inverse power-law of word frequency; (ii) the law of innovation---independent of frequency, words that are more polysemous have higher rates of semantic change.",4 "classification of approaches and challenges of frequent subgraphs mining in biological networks. Understanding the structure and dynamics of biological networks is one of the most important challenges in systems biology. In addition, the increasing amount of experimental data on biological networks necessitates the use of efficient methods to analyze these huge amounts of data. Such methods require recognizing common patterns to analyze the data. As biological networks can be modeled as graphs, the problem of common pattern recognition is equivalent to frequent subgraph mining in a set of graphs. In this paper, first the challenges of frequent subgraph mining in biological networks are introduced, and existing approaches are classified for each challenge.
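The semantic-change scores behind the diachronic-embeddings laws above reduce, once the embedding spaces of two time periods are aligned, to a cosine distance per word. A toy version that assumes the alignment (e.g. by orthogonal Procrustes, as in such pipelines) has already been done:

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def semantic_change(emb_t1, emb_t2, word):
    """Toy change score: cosine distance between a word's (already
    aligned) embedding vectors at two time points. A score of 0 means
    the word's usage is unchanged; larger scores mean more change."""
    return 1.0 - cosine(emb_t1[word], emb_t2[word])
```

Regressing such per-word scores against log word frequency is how a power-law relation like the law of conformity would be tested.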
Then, the algorithms are analyzed on the basis of the type of approach they apply to these challenges.",4 "weakly submodular maximization beyond cardinality constraints: does randomization help greedy?. Submodular functions are a broad class of set functions which naturally arise in diverse areas. Many algorithms have been suggested for the maximization of these functions. Unfortunately, once the function deviates from submodularity, the known algorithms may perform arbitrarily poorly. Amending this issue, by obtaining approximation results for set functions generalizing submodular functions, has been the focus of several recent works. One such class, known as weakly submodular functions, has received a lot of attention. A key result proved by Das and Kempe (2011) showed that the approximation ratio of the greedy algorithm for weakly submodular maximization subject to a cardinality constraint degrades smoothly with the distance from submodularity. However, no results have been obtained for maximization subject to constraints beyond cardinality. In particular, it is not known whether the greedy algorithm achieves any non-trivial approximation ratio for such constraints. In this paper, we prove that a randomized version of the greedy algorithm (previously used by Buchbinder et al. (2014) for a different problem) achieves an approximation ratio of $(1 + 1/\gamma)^{-2}$ for the maximization of a weakly submodular function subject to a general matroid constraint, where $\gamma$ is a parameter measuring the distance of the function from submodularity. Moreover, we also experimentally compare the performance of this version of the greedy algorithm on real world problems against natural benchmarks, and show that the algorithm we study performs well in practice too. To the best of our knowledge, this is the first algorithm with a non-trivial approximation guarantee for maximizing a weakly submodular function subject to a constraint other than the simple cardinality constraint. In particular, it is the first such algorithm for the important and broad class of matroid constraints.",4 "visualizing the loss landscape of neural nets. Neural network training relies on our ability to find ""good"" minimizers of highly non-convex loss functions. It is well known that certain network architecture designs (e.g., skip connections) produce loss functions that train easier, and that well-chosen training parameters (batch size, learning rate, optimizer) produce minimizers that generalize better.
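The randomized greedy variant studied above can be sketched on a toy coverage objective under a cardinality constraint (a special case of the matroid constraints in the paper): at every step, collect the k best candidates by marginal gain and pick one of them uniformly at random, in the spirit of Buchbinder et al. (2014). The coverage objective and set layout here are illustrative assumptions:

```python
import random

def random_greedy(sets_dict, k, seed=0):
    """Randomized greedy for picking k sets: at each step, rank the
    remaining candidates by marginal coverage gain, keep the top k,
    and choose one of them uniformly at random. Coverage stands in
    for a (weakly) submodular objective."""
    rng = random.Random(seed)
    chosen, covered = [], set()
    for _ in range(k):
        gains = sorted(((len(items - covered), name)
                        for name, items in sets_dict.items()
                        if name not in chosen), reverse=True)
        top = [name for _, name in gains[:k]]
        pick = rng.choice(top)
        chosen.append(pick)
        covered |= sets_dict[pick]
    return chosen, len(covered)
```

The randomization is what the paper's analysis leverages to get a guarantee beyond cardinality constraints; deterministic greedy corresponds to always taking the single best candidate.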
However, the reasons for these differences, and their effect on the underlying loss landscape, are not well understood. In this paper, we explore the structure of neural loss functions, and the effect of loss landscapes on generalization, using a range of visualization methods. First, we introduce a simple ""filter normalization"" method that helps us visualize loss function curvature and make meaningful side-by-side comparisons between loss functions. Then, using a variety of visualizations, we explore how network architecture affects the loss landscape, and how training parameters affect the shape of minimizers.",4 "zipf's law emerges asymptotically during phase transitions in communicative systems. Zipf's law predicts a power-law relationship between word rank and frequency in language communication systems, and is widely reported in texts, yet it remains enigmatic as to its origins. Computer simulations have shown that language communication systems emerge at an abrupt phase transition in the fidelity of mappings between symbols and objects. Since the phase transition approximates the Heaviside step function, we show that Zipfian scaling emerges asymptotically at high rank based on the Laplace transform. We thereby demonstrate that Zipf's law gradually emerges from the moment of the phase transition in communicative systems. We also show that this power-law scaling behavior explains the emergence of natural languages at phase transitions. We find that the emergence of Zipf's law during language communication suggests that the use of rare words in a lexicon is critical for the construction of an effective communicative system at the phase transition.",15 "drunet: a dilated-residual u-net deep learning network to digitally stain optic nerve head tissues in optical coherence tomography images. Given that the neural and connective tissues of the optic nerve head (ONH) exhibit complex morphological changes with the development and progression of glaucoma, their simultaneous isolation from optical coherence tomography (OCT) images may be of great interest for the clinical diagnosis and management of this pathology. A deep learning algorithm was designed and trained to digitally stain (i.e. highlight) 6 ONH tissue layers by capturing both the local (tissue texture) and contextual information (spatial arrangement of tissues).
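The rank-frequency relationship in the Zipf's law abstract above is easy to check empirically: fit the least-squares slope of log-frequency against log-rank; an exponent near -1 indicates Zipfian scaling. A minimal sketch:

```python
import math

def zipf_exponent(freqs):
    """Least-squares slope of log(frequency) against log(rank).
    For data following Zipf's law the fitted exponent is close to -1."""
    freqs = sorted(freqs, reverse=True)
    xs = [math.log(r) for r in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den
```

On synthetic frequencies proportional to 1/rank the fit recovers exactly -1; real corpora deviate at the head and tail, which is where the asymptotic argument in the abstract applies.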
The overall Dice coefficient (mean of all tissues) was $0.91 \pm 0.05$ when assessed against manual segmentations performed by an expert observer. We offer a robust segmentation framework that could be extended for the automated parametric study of the ONH tissues.",4 "chromatag: a colored marker and fast detection algorithm. Current fiducial marker detection algorithms rely on marker IDs for false positive rejection. Time is wasted on potential detections that will eventually be rejected as false positives. We introduce ChromaTag, a fiducial marker and detection algorithm designed to use opponent colors to limit and quickly reject initial false detections, and grayscale for precise localization. Through experiments, we show that ChromaTag is significantly faster than current fiducial markers while achieving similar or better detection accuracy. We also show how tag size and viewing direction affect detection accuracy. Our contribution is significant because fiducial markers are often used in real-time applications (e.g. marker assisted robot navigation) where heavy computation is required in other parts of the system.",4 "end-to-end learning of action detection from frame glimpses in videos. In this work we introduce a fully end-to-end approach for action detection in videos that learns to directly predict the temporal bounds of actions. Our intuition is that the process of detecting actions is naturally one of observation and refinement: observing moments in video, and refining hypotheses about when an action is occurring. Based on this insight, we formulate our model as a recurrent neural network-based agent that interacts with a video over time. The agent observes video frames and decides both where to look next and when to emit a prediction. Since backpropagation is not adequate in this non-differentiable setting, we use REINFORCE to learn the agent's decision policy. Our model achieves state-of-the-art results on the THUMOS'14 and ActivityNet datasets while observing only a fraction (2% or less) of the video frames.",4 "rule-based emotion detection on social media: putting tweets on plutchik's wheel. We study sentiment analysis beyond the typical granularity of polarity, and instead use Plutchik's wheel of emotions model. We introduce RBEM-Emo as an extension to the rule-based emission model algorithm to deduce such emotions from human-written messages.
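The REINFORCE policy gradient used by the frame-glimpse agent above replaces backpropagation through a discrete decision with a score-function estimate. A one-parameter Bernoulli sketch (the reward function and the sigmoid policy here are illustrative, not the paper's agent):

```python
import math
import random

def reinforce_gradient(theta, reward, n=2000, seed=0):
    """Score-function (REINFORCE) estimate of d/dtheta E[reward(a)]
    for a ~ Bernoulli(sigmoid(theta)): average reward(a) * dlogp/dtheta
    over sampled actions. For this policy, dlog p(a)/dtheta = a - p."""
    rng = random.Random(seed)
    p = 1.0 / (1.0 + math.exp(-theta))
    grad = 0.0
    for _ in range(n):
        a = 1 if rng.random() < p else 0
        grad += reward(a) * (a - p)
    return grad / n
```

With reward(a) = a and theta = 0, the true gradient is p(1-p) = 0.25, and the sampled estimate concentrates around it.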
We evaluate our approach on two different datasets and compare its performance with the current state-of-the-art techniques for emotion detection, including a recursive auto-encoder. The results of the experimental study suggest that RBEM-Emo is a promising approach advancing the current state-of-the-art in emotion detection.",4 "stochastic dual coordinate ascent methods for regularized loss minimization. Stochastic gradient descent (SGD) has become popular for solving large scale supervised machine learning optimization problems such as SVM, due to its strong theoretical guarantees. While the closely related dual coordinate ascent (DCA) method has been implemented in various software packages, it has so far lacked a good convergence analysis. This paper presents a new analysis of stochastic dual coordinate ascent (SDCA), showing that this class of methods enjoys strong theoretical guarantees that are comparable or better than SGD. This analysis justifies the effectiveness of SDCA for practical applications.",19 "learning polynomial networks for classification of clinical electroencephalograms. We describe a polynomial network technique developed for learning to classify clinical electroencephalograms (EEGs) presented by noisy features. Using an evolutionary strategy implemented within a group method of data handling, we learn classification models which are comprehensively described by sets of short-term polynomials. The polynomial models were learnt to classify the EEGs recorded from Alzheimer and healthy patients, and to recognize EEG artifacts. Comparing the performance of our technique with that of other machine learning methods, we conclude that our technique can learn well-suited polynomial models which experts find easy to understand.",4 "learning to attend, copy, and generate for session-based query suggestion. Users try to articulate their complex information needs during search sessions by reformulating their queries. To make this process more effective, search engines provide related queries to help users specify their information need during the search process. In this paper, we propose a customized sequence-to-sequence model for session-based query suggestion. In our model, we employ a query-aware attention mechanism to capture the structure of the session context.
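The SDCA method analyzed above has a particularly clean closed-form coordinate update for the squared loss (ridge regression): maximize the dual over one coordinate alpha_i while keeping the primal vector w = sum_i alpha_i x_i / (lambda n) in sync. A sketch of that special case, not the paper's general formulation:

```python
import random

def sdca_ridge(X, y, lam, epochs=500, seed=0):
    """Stochastic dual coordinate ascent for L2-regularized squared
    loss: min_w (1/n) sum_i 0.5*(w.x_i - y_i)^2 + (lam/2)*||w||^2.
    Each step solves the one-dimensional dual subproblem for a random
    coordinate alpha_i in closed form."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    alpha = [0.0] * n
    w = [0.0] * d
    for _ in range(epochs * n):
        i = rng.randrange(n)
        xi = X[i]
        margin = sum(wj * xj for wj, xj in zip(w, xi))
        # Closed-form dual coordinate maximizer for the squared loss.
        delta = (y[i] - margin - alpha[i]) / (1.0 + sum(v * v for v in xi) / (lam * n))
        alpha[i] += delta
        scale = delta / (lam * n)
        w = [wj + scale * xj for wj, xj in zip(w, xi)]
    return w
```

On one-dimensional data the iterates converge to the closed-form ridge solution mean(x*y) / (mean(x^2) + lam), which makes the update easy to sanity-check.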
This enables us to control the scope of the session from which we infer the suggested next query, which helps not only to handle noisy data but also to automatically detect session boundaries. Furthermore, we observe that, based on user query reformulation behavior, within a single session a large portion of query terms is retained from the previously submitted queries and consists mostly of infrequent or unseen terms that are usually not included in the vocabulary. We therefore empower the decoder of our model to access the source words from the session context during decoding by incorporating a copy mechanism. Moreover, we propose evaluation metrics to assess the quality of generative models for query suggestion. We conduct an extensive set of experiments and analysis. The results suggest that our model outperforms the baselines both in generating queries and in scoring candidate queries for the task of query suggestion.",4 "the author-topic model for authors and documents. We introduce the author-topic model, a generative model for documents that extends latent dirichlet allocation (LDA; Blei, Ng, & Jordan, 2003) to include authorship information. Each author is associated with a multinomial distribution over topics, and each topic is associated with a multinomial distribution over words. A document with multiple authors is modeled as a distribution over topics that is a mixture of the distributions associated with the authors. We apply the model to a collection of 1,700 NIPS conference papers and 160,000 CiteSeer abstracts. Exact inference is intractable for these datasets, and we use Gibbs sampling to estimate the topic and author distributions. We compare the performance with two other generative models for documents, which are special cases of the author-topic model: LDA (a topic model) and a simple author model in which each author is associated with a distribution over words rather than a distribution over topics. We show topics recovered by the author-topic model, and demonstrate applications to computing similarity between authors and entropy of author output.",4 "equivalence of distance-based and rkhs-based statistics in hypothesis testing. We provide a unifying framework linking two classes of statistics used in two-sample and independence testing: on the one hand, the energy distances and distance covariances from the statistics literature; on the other, maximum mean discrepancies (MMD), that is, distances between embeddings of distributions to reproducing kernel Hilbert spaces (RKHS), as established in machine learning.
In the case where the energy distance is computed with a semimetric of negative type, a positive definite kernel, termed the distance kernel, may be defined such that the MMD corresponds exactly to the energy distance. Conversely, for any positive definite kernel, the MMD can be interpreted as an energy distance with respect to some negative-type semimetric. This equivalence readily extends to distance covariance, using kernels on the product space. We determine the class of probability distributions for which the test statistics are consistent against all alternatives. Finally, we investigate the performance of the family of distance kernels in two-sample and independence tests: we show in particular that the energy distance most commonly employed in statistics is just one member of a parametric family of kernels, and that other choices from this family can yield more powerful tests.",19 "predicting the co-evolution of event and knowledge graphs. Embedding learning, a.k.a. representation learning, has been shown to be able to model large-scale semantic knowledge graphs. A key concept is the mapping of the knowledge graph to a tensor representation whose entries are predicted by models using latent representations of generalized entities. Knowledge graphs are typically treated as static: a knowledge graph grows more links when more facts become available, but the ground truth values associated with links are considered time invariant. In this paper we address the issue of knowledge graphs where triple states depend on time. We assume that changes in the knowledge graph always arrive in the form of events, in the sense that events are the gateway to the knowledge graph. We train an event prediction model which uses both knowledge graph background information and information on recent events. By predicting future events, we also predict likely changes in the knowledge graph and thus obtain a model for the evolution of the knowledge graph as well. Our experiments demonstrate that our approach performs well in a clinical application, a recommendation engine, and a sensor network application.",4 "a classification approach based on association rules mining for unbalanced data. This paper deals with a binary classification task where the target class has a lower probability of occurrence. In such a situation, it is not possible to build a powerful classifier by using standard methods such as logistic regression, classification trees, discriminant analysis, etc.
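The equivalence stated in the abstract above can be checked numerically: with the distance-induced kernel k(a,b) = (d(a,z) + d(b,z) - d(a,b)) / 2, the biased (V-statistic) MMD^2 equals exactly half the V-statistic energy distance, for any choice of the centring point z. A small verification sketch:

```python
def euclid(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def energy_distance(X, Y):
    """V-statistic energy distance: 2 E|X-Y| - E|X-X'| - E|Y-Y'|."""
    exy = sum(euclid(x, y) for x in X for y in Y) / (len(X) * len(Y))
    exx = sum(euclid(a, b) for a in X for b in X) / len(X) ** 2
    eyy = sum(euclid(a, b) for a in Y for b in Y) / len(Y) ** 2
    return 2 * exy - exx - eyy

def mmd2_distance_kernel(X, Y, z):
    """Biased MMD^2 with the distance-induced kernel centred at z.
    By the equivalence result, this equals energy_distance(X, Y) / 2;
    the dependence on z cancels term by term."""
    k = lambda a, b: 0.5 * (euclid(a, z) + euclid(b, z) - euclid(a, b))
    kxx = sum(k(a, b) for a in X for b in X) / len(X) ** 2
    kyy = sum(k(a, b) for a in Y for b in Y) / len(Y) ** 2
    kxy = sum(k(a, b) for a in X for b in Y) / (len(X) * len(Y))
    return kxx + kyy - 2 * kxy
```

Expanding kxx + kyy - 2 kxy shows all terms involving z cancel, leaving E|X-Y| - (E|X-X'| + E|Y-Y'|) / 2, which is half the energy distance.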
To overcome this short-coming of these methods, which yield classifiers with low sensibility, we tackled the classification problem with an approach based on association rules learning. This approach has the advantage of allowing the identification of the patterns that are well correlated with the target class. Association rules learning is a well known method in the area of data-mining. It is used, when dealing with large databases, for the unsupervised discovery of local patterns that express hidden relationships between input variables. Considering association rules from a supervised learning point of view, a relevant set of weak classifiers is obtained, from which one derives a classifier that performs well.",19 "on valid optimal assignment kernels and applications to graph classification. The success of kernel methods has initiated the design of novel positive semidefinite functions, in particular for structured data. A leading design paradigm for this is the convolution kernel, which decomposes structured objects into their parts and sums over all pairs of parts. Assignment kernels, in contrast, are obtained from an optimal bijection between parts, which can provide a more valid notion of similarity. In general, however, optimal assignments yield indefinite functions, which complicates their use in kernel methods. We characterize a class of base kernels used to compare parts that guarantees positive semidefinite optimal assignment kernels. These base kernels give rise to hierarchies from which the optimal assignment kernels are computed in linear time by histogram intersection. We apply these results by developing the Weisfeiler-Lehman optimal assignment kernel for graphs. It provides high classification accuracy on widely-used benchmark data sets, improving over the original Weisfeiler-Lehman kernel.",4 "towards label imbalance in multi-label classification with many labels. In multi-label classification, an instance may be associated with a set of labels simultaneously. Recently, the research on multi-label classification has largely shifted its focus to the other end of the spectrum, where the number of labels is assumed to be extremely large. The existing works focus on how to design scalable algorithms that offer fast training procedures and have a small memory footprint. However, they ignore and even compound another challenge - the label imbalance problem. To address this drawback, we propose a novel representation-based multi-label learning with sampling (RMLS) approach.
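The linear-time computation mentioned in the assignment-kernel abstract above comes down to histogram intersection over part labels: sum, over each label, the minimum of its counts in the two objects. A minimal version for multisets of discrete labels (the labels here are placeholders, not Weisfeiler-Lehman colors):

```python
from collections import Counter

def histogram_intersection(a, b):
    """Histogram intersection over two multisets of discrete labels:
    sum over labels of min(count in a, count in b). For suitable base
    kernels this equals the optimal assignment kernel value and is
    computable in linear time."""
    ca, cb = Counter(a), Counter(b)
    return sum(min(ca[label], cb[label]) for label in ca.keys() & cb.keys())
```

Intuitively, each matched label pair contributes one unit to the best bijection between parts, so the minimum count per label is exactly the number of parts that can be matched.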
best knowledge, first tackle imbalance problem multi-label classification many labels. experimentations real-world datasets demonstrate effectiveness proposed approach.",4 "discussion validation tests employed compare human action recognition methods using msr action3d dataset. paper aims determine best human action recognition method based features extracted rgb-d devices, microsoft kinect. review papers make reference msr action3d, used dataset includes depth information acquired rgb-d device, performed. found validation method used work differs others. so, direct comparison among works cannot made. however, almost works present results comparing without taking account issue. therefore, present different rankings according methodology used validation order clarify existing confusion.",4 "ant colony algorithm weighted item layout optimization problem. paper discusses problem placing weighted items circular container two-dimensional space. problem great practical significance various mechanical engineering domains, design communication satellites. two constructive heuristics proposed, one packing circular items packing rectangular items. work first optimizing object placement order, optimizing object positioning. based heuristics, ant colony optimization (aco) algorithm described search first optimal positioning order, optimal layout. describe results numerical experiments, test two versions aco algorithm alongside local search methods previously described literature. results show constructive heuristic-based aco performs better existing methods larger problem instances.",4 "mga trajectory planning aco-inspired algorithm. given set celestial bodies, problem finding optimal sequence swing-bys, deep space manoeuvres (dsm) transfer arcs connecting elements set combinatorial nature. number possible paths grows exponentially number celestial bodies. therefore, design optimal multiple gravity assist (mga) trajectory np-hard mixed combinatorial-continuous problem. 
automated solution would greatly improve design future space missions, allowing assessment large number alternative mission options short time. work proposes formulate complete automated design multiple gravity assist trajectory autonomous planning scheduling problem. resulting scheduled plan provide optimal planetary sequence good estimation set associated optimal trajectories. trajectory model consists sequence celestial bodies connected two-dimensional transfer arcs containing one dsm. transfer arc, position planet spacecraft, time arrival, matched varying pericentre preceding swing-by, magnitude launch excess velocity, first arc. departure date, model generates full tree possible transfers departure destination planet. leaf tree represents planetary encounter possible way reach planet. algorithm inspired ant colony optimization (aco) devised explore space possible plans. ants explore tree departure destination adding one node time: every time ant node, probability function used select feasible direction. approach automatic trajectory planning applied design optimal transfers saturn among galilean moons jupiter.",4 "first-order methods almost always avoid saddle points. establish first-order methods avoid saddle points almost initializations. results apply wide variety first-order methods, including gradient descent, block coordinate descent, mirror descent variants thereof. connecting thread algorithms studied dynamical systems perspective appropriate instantiations stable manifold theorem allow global stability analysis. thus, neither access second-order derivative information randomness beyond initialization necessary provably avoid saddle points.",19 "parallel tracking verifying: framework real-time high accuracy visual tracking. intensively studied, visual tracking seen great recent advances either speed (e.g., correlation filters) accuracy (e.g., deep features). real-time high accuracy tracking algorithms, however, remain scarce. 
paper study problem new perspective present novel parallel tracking verifying (ptav) framework, taking advantage ubiquity multi-thread techniques borrowing success parallel tracking mapping visual slam. ptav framework typically consists two components, tracker verifier v, working parallel two separate threads. tracker aims provide super real-time tracking inference expected perform well time; contrast, verifier v checks tracking results corrects needed. key innovation that, v work every frame upon requests t; end, may adjust tracking according feedback v. collaboration, ptav enjoys high efficiency provided strong discriminative power v. extensive experiments popular benchmarks including otb2013, otb2015, tc128 uav20l, ptav achieves best tracking accuracy among real-time trackers, fact performs even better many deep learning based solutions. moreover, general framework, ptav flexible great rooms improvement generalization.",4 "dual supervised learning. many supervised learning tasks emerged dual forms, e.g., english-to-french translation vs. french-to-english translation, speech recognition vs. text speech, image classification vs. image generation. two dual tasks intrinsic connections due probabilistic correlation models. connection is, however, effectively utilized today, since people usually train models two dual tasks separately independently. work, propose training models two dual tasks simultaneously, explicitly exploiting probabilistic correlation regularize training process. ease reference, call proposed approach \emph{dual supervised learning}. demonstrate dual supervised learning improve practical performances tasks, various applications including machine translation, image processing, sentiment analysis.",4 "echo state condition critical point. recurrent networks transfer functions fulfill lipschitz continuity k=1 may echo state networks certain limitations recurrent connectivity applied. 
shown sufficient largest singular value recurrent connectivity smaller 1. main achievement paper proof conditions network echo state network even largest singular value one. turns critical case exact shape transfer function plays decisive role determining whether network still fulfills echo state condition. addition, several examples one neuron networks outlined illustrate effects critical connectivity. moreover, within manuscript mathematical definition critical echo state network suggested.",4 "random feedback weights support learning deep neural networks. brain processes information many layers neurons. deep architecture representationally powerful, complicates learning making hard identify responsible neurons mistake made. machine learning, backpropagation algorithm assigns blame neuron computing exactly contributed error. this, multiplies error signals matrices consisting synaptic weights neuron's axon farther downstream. operation requires precisely choreographed transport synaptic weight information, thought impossible brain. present surprisingly simple algorithm deep learning, assigns blame multiplying error signals random synaptic weights. show network learn extract useful information signals sent random feedback connections. essence, network learns learn. demonstrate new mechanism performs quickly accurately backpropagation variety problems describe principles underlie function. demonstration provides plausible basis neuron adapted using error signals generated distal locations brain, thus dispels long-held assumptions algorithmic constraints learning neural circuits.",16 "horn: system parallel training regularizing large-scale neural networks. introduce new distributed system effective training regularizing large-scale neural networks distributed computing architectures. 
experiments demonstrate effectiveness flexible model partitioning parallelization strategies based neuron-centric computation model, implementation collective parallel dropout neural networks training. experiments performed mnist handwritten digits classification including results.",4 "complexity curve fitting algorithms. study popular algorithm fitting polynomial curves scattered data based least squares gradient weights. show sometimes algorithm admits substantial reduction complexity, and, furthermore, find precise conditions possible. turns is, indeed, possible one fits circles ellipses hyperbolas.",4 "offline handwritten signature identification using adaptive window positioning techniques. paper presents address challenge, proposed use adaptive window positioning technique focuses meaning handwritten signature also individuality writer. innovative technique divides handwritten signature 13 small windows size nxn (13x13). this size large enough contain ample information style author small enough ensure good identification performance. the process tested gpds data set containing 4870 signature samples 90 different writers comparing robust features test signature user signature using appropriate classifier. experimental results reveal adaptive window positioning technique proved efficient reliable method accurate signature feature extraction identification offline handwritten signatures. the contribution technique used detect signatures signed emotional duress.",4 "language structure n-object naming game. examine naming game two agents trying establish common vocabulary n objects. efforts lead emergence language allows efficient communication exhibits degree homonymy synonymy. although homonymy reduces communication efficiency, seems dynamical trap persists long, perhaps indefinite, time. hand, synonymy reduce efficiency communication, appears transient feature language. thus, model role synonymy decreases long-time limit becomes negligible. 
similar rareness synonymy observed present natural languages. role noise, distorts communicated words, also examined. although, general, noise reduces communication efficiency, also regroups words evenly distributed within available ""verbal"" space.",4 "planning learning. paper introduces framework planning learning agent given goal achieve environment whose behavior partially known agent. discuss tractability various plan-design processes. show large natural class planning learning systems, plan presented verified reasonable time. however, coming algorithmically plan, even simple classes systems apparently intractable. emphasize role off-line plan-design processes, show that, natural cases, verification (projection) part carried efficient algorithmic manner.",4 "rotation invariance neural network. rotation invariance translation invariance great values image recognition tasks. paper, bring new architecture convolutional neural network (cnn) named cyclic convolutional layer achieve rotation invariance 2-d symbol recognition. also get position orientation 2-d symbol network achieve detection purpose multiple non-overlap target. last least, architecture achieve one-shot learning cases using invariance.",4 "two-stage sampled learning theory distributions. focus distribution regression problem: regressing real-valued response probability distribution. although exist large number similarity measures distributions, little known generalization performance specific learning tasks. learning problems formulated distributions inherent two-stage sampled difficulty: practice samples sampled distributions observable, one build estimate similarities computed sets points. best knowledge, existing method consistency guarantees distribution regression requires kernel density estimation intermediate step (which suffers slow convergence issues high dimensions), domain distributions compact euclidean. 
paper, provide theoretical guarantees remarkably simple algorithmic alternative solve distribution regression problem: embed distributions reproducing kernel hilbert space, learn ridge regressor embeddings outputs. main contribution prove consistency technique two-stage sampled setting mild conditions (on separable, topological domains endowed kernels). given total number observations, derive convergence rates explicit function problem difficulty. special case, answer 15-year-old open question: establish consistency classical set kernel [haussler, 1999; gartner et al., 2002] regression, cover recent kernels distributions, including due [christmann steinwart, 2010].",12 "hashing algorithms large-scale learning. paper, first demonstrate b-bit minwise hashing, whose estimators positive definite kernels, naturally integrated learning algorithms svm logistic regression. adopt simple scheme transform nonlinear (resemblance) kernel linear (inner product) kernel; hence large-scale problems solved extremely efficiently. method provides simple effective solution large-scale learning massive extremely high-dimensional datasets, especially data fit memory. compare b-bit minwise hashing vowpal wabbit (vw) algorithm (which related count-min (cm) sketch). interestingly, vw variances random projections. theoretical empirical comparisons illustrate usually $b$-bit minwise hashing significantly accurate (at storage) vw (and random projections) binary data. furthermore, $b$-bit minwise hashing combined vw achieve improvements terms training speed, especially $b$ large.",19 "cross-language framework word recognition spotting indic scripts. handwritten word recognition spotting low-resource scripts difficult sufficient training data available often expensive collecting data scripts. 
paper presents novel cross language platform handwritten word recognition spotting low-resource scripts training performed sufficiently large dataset available script (considered source script) testing done scripts (considered target script). training one source script testing another script reasonable result easy handwriting domain due complex nature handwriting variability among scripts. also difficult mapping source target characters appear cursive word images. proposed indic cross language framework exploits large resource dataset training uses recognizing spotting text target scripts sufficient amount training data available. since indic scripts mostly written 3 zones, namely, upper, middle lower, employ zone-wise character (or component) mapping efficient learning purpose. performance cross-language framework depends extent similarity source target scripts. hence, devise entropy based script similarity score using source target character mapping provide feasibility cross language transcription. tested approach three indic scripts, namely, bangla, devanagari gurumukhi, corresponding results reported.",4 "supervised saliency map driven segmentation lesions dermoscopic images. lesion segmentation first step automatic melanoma recognition systems. deficiencies difficulties dermoscopic images make lesion segmentation intricate task e.g., hair occlusion, presence dark corners color charts, indistinct lesion borders, lesions touching image boundaries. order overcome problems, proposed supervised saliency detection method specially tailored dermoscopic images based discriminative regional feature integration (drfi) method. drfi method incorporates multi-level segmentation, regional contrast, property backgroundness descriptors, random forest regressor create saliency scores region image. improved saliency detection method, mdrfi, introduced features regional property descriptors proposed novel pseudo-background region boost performance. 
overall segmentation framework uses saliency map construct initial mask lesion thresholding post-processing operations. initial mask evolving level set framework fit better lesion boundaries. results evaluation experiments three public datasets show proposed segmentation method outperforms conventional state-of-the-art segmentation algorithms performance comparable recent deep convolutional neural networks based approaches.",4 "exploiting sparsity build efficient kernel based collaborative filtering top-n item recommendation. increasing availability implicit feedback datasets raised interest developing effective collaborative filtering techniques able deal asymmetrically unambiguous positive feedback ambiguous negative feedback. paper, propose principled kernel-based collaborative filtering method top-n item recommendation implicit feedback. present efficient implementation using linear kernel, show generalize kernels dot product family preserving efficiency. also investigate elements influence sparsity standard cosine kernel. analysis shows sparsity kernel strongly depends properties dataset, particular long tail distribution. compare method state-of-the-art algorithms achieving good results terms efficiency effectiveness.",4 "attention-set based metric learning video face recognition. face recognition made great progress development deep learning. however, video face recognition (vfr) still ongoing task due various illumination, low-resolution, pose variations motion blur. existing cnn-based vfr methods obtain feature vector single image simply aggregate features video, less consider correlations face images one video. paper, propose novel attention-set based metric learning (asml) method measure statistical characteristics image sets. promising generalized extension maximum mean discrepancy memory attention weighting. first, define effective distance metric image sets, explicitly minimizes intra-set distance maximizes inter-set distance simultaneously. 
second, inspired neural turing machine, memory attention weighting proposed adapt set-aware global contents. asml naturally integrated cnns, resulting end-to-end learning scheme. method achieves state-of-the-art performance task video face recognition three widely used benchmarks including youtubeface, youtube celebrities celebrity-1000.",4 "positive definite matrices s-divergence. positive definite matrices abound dazzling variety applications. ubiquity part attributed rich geometric structure: positive definite matrices form self-dual convex cone whose strict interior riemannian manifold. manifold view endowed ""natural"" distance function conic view not. nevertheless, drawing motivation conic view, introduce s-divergence ""natural"" distance-like function open cone positive definite matrices. motivate s-divergence via sequence results connect riemannian distance. particular, show (a) divergence square distance; (b) several geometric properties similar riemannian distance, though without computationally demanding. s-divergence even intriguing: although nonconvex, still compute matrix means medians using global optimality. complement results numerical experiments illustrating theorems optimization algorithm computing matrix medians.",12 "using natural language processing screen patients active heart failure: exploration hospital-wide surveillance. paper, proposed two different approaches, rule-based approach machine-learning based approach, identify active heart failure cases automatically analyzing electronic health records (ehr). rule-based approach, extracted cardiovascular data elements clinical notes matched patients different colors according heart failure condition using rules provided experts heart failure. achieved 69.4% accuracy 0.729 f1-score. machine learning approach, bigram clinical notes features, tried four different models svm linear kernel achieved best performance 87.5% accuracy 0.86 f1-score. 
also, classification comparison four different models, believe linear models fit better problem. combine machine-learning rule-based algorithms, enable hospital-wide surveillance active heart failure increased accuracy interpretability outputs.",4 "neural pca deep unsupervised learning. network supporting deep unsupervised learning presented. network autoencoder lateral shortcut connections encoder decoder level hierarchy. lateral shortcut connections allow higher levels hierarchy focus abstract invariant features. standard autoencoders analogous latent variable models single layer stochastic variables, proposed network analogous hierarchical latent variables models. learning combines denoising autoencoder denoising sources separation frameworks. layer network contributes cost function term measures distance representations produced encoder decoder. since training signals originate levels network, layers learn efficiently even deep networks. speedup offered cost terms higher levels hierarchy ability learn invariant features demonstrated experiments.",19 "image pixel fusion human face recognition. paper present technique fusion optical thermal face images based image pixel fusion approach. several factors, affect face recognition performance case visual images, illumination changes significant factor needs addressed. thermal images better handling illumination conditions consistent capturing texture details faces. factors like sunglasses, beard, moustache etc also play active role adding complicacies recognition process. fusion thermal visual images solution overcome drawbacks present individual thermal visual face images. fused images projected eigenspace projected images classified using radial basis function (rbf) neural network also multi-layer perceptron (mlp). experiments object tracking classification beyond visible spectrum (otcbvs) database benchmark thermal visual face images used. 
comparison experimental results show proposed approach performs significantly well recognizing face images success rate 96% 95.07% rbf neural network mlp respectively.",4 "god(s) know(s): developmental cross-cultural patterns children drawings. paper introduces novel approach data analysis designed needs specialists psychology religion. detect developmental cross-cultural patterns children's drawings god(s) supernatural agents. develop methods objectively evaluate empirical observations drawings respect to: (1) gravity center, (2) average intensities colors \emph{green} \emph{yellow}, (3) use different colors (palette) (4) visual complexity drawings. find statistically significant differences across ages countries gravity centers average intensities colors. findings support hypotheses experts raise new questions investigation.",4 "linear time natural evolution strategy non-separable functions. present novel natural evolution strategy (nes) variant, rank-one nes (r1-nes), uses low rank approximation search distribution covariance matrix. algorithm allows computation natural gradient cost linear dimensionality parameter space, excels solving high-dimensional non-separable problems, including best result date rosenbrock function (512 dimensions).",4 "integrating prosodic lexical cues automatic topic segmentation. present probabilistic model uses prosodic lexical cues automatic segmentation speech topically coherent units. propose two methods combining lexical prosodic information using hidden markov models decision trees. lexical information obtained speech recognizer, prosodic features extracted automatically speech waveforms. evaluate approach broadcast news corpus, using darpa-tdt evaluation metrics. results show prosodic model alone competitive word-based segmentation methods. furthermore, achieve significant reduction error combining prosodic word-based knowledge sources.",4 "stable recovery sparse vectors random sinusoidal feature maps. 
random sinusoidal features popular approach speeding kernel-based inference large datasets. prior inference stage, approach suggests performing dimensionality reduction first multiplying data vector random gaussian matrix, computing element-wise sinusoid. theoretical analysis shows collecting sufficient number features reliably used subsequent inference kernel classification regression. work, demonstrate mild increase dimension embedding, also possible reconstruct data vector random sinusoidal features, provided underlying data sparse enough. particular, propose numerically stable algorithm reconstructing data vector given nonlinear features, analyze sample complexity. algorithm extended types structured inverse problems, demixing pair sparse (but incoherent) vectors. support efficacy approach via numerical experiments.",19 "possibility neutrosophic soft sets applications decision making similarity measure. paper, concept possibility neutrosophic soft set operations defined, properties studied. application theory decision making investigated. also similarity measure two possibility neutrosophic soft sets introduced discussed. finally application similarity measure given select suitable person position firm.",4 "ensemble methods convex regression applications geometric programming based circuit design. convex regression promising area bridging statistical estimation deterministic convex optimization. new piecewise linear convex regression methods fast scalable, instability used approximate constraints objective functions optimization. ensemble methods, like bagging, smearing random partitioning, alleviate problem maintain theoretical properties underlying estimator. empirically examine performance ensemble methods prediction optimization, apply device modeling constraint approximation geometric programming based circuit design.",4 "combinatorial algorithm compute regularization paths. 
wide variety regularization methods, algorithms computing entire solution path developed recently. solution path algorithms compute solution one particular value regularization parameter entire path solutions, making selection optimal parameter much easier. currently used algorithms robust sense cannot deal general degenerate input. present new robust, generic method parametric quadratic programming. algorithm directly applies nearly machine learning applications, far every application required different algorithm. illustrate usefulness method applying low rank problem could solved existing path tracking methods, namely compute part-worth values choice based conjoint analysis, popular technique market research estimate consumers preferences class parameterized options.",4 "dvqa: understanding data visualizations via question answering. bar charts effective way humans convey information other, today's algorithms cannot parse them. existing methods fail faced minor variations appearance. here, present dvqa, dataset tests many aspects bar chart understanding question answering framework. unlike visual question answering (vqa), dvqa requires processing words answers unique particular bar chart. state-of-the-art vqa algorithms perform poorly dvqa, propose two strong baselines perform considerably better. work enable algorithms automatically extract semantic information vast quantities literature science, business, areas.",4 "toward robust diversity-based model detect changes context. able automatically quickly understand user context session main issue recommender systems. first step toward achieving goal, propose model observes real time diversity brought item relatively short sequence consultations, corresponding recent user history. model complexity constant time, generic since apply type items within online service (e.g. profiles, products, music tracks) application domain (e-commerce, social network, music streaming), long partial item descriptions. 
observation diversity level time allows us detect implicit changes. long term, plan characterize context, i.e. find common features among contiguous sub-sequence items two changes context determined model. allow us make context-aware privacy-preserving recommendations, explain users. ongoing research, first step consists studying robustness model detecting changes context. order so, use music corpus 100 users 210,000 consultations (number songs played global history). validate relevancy detections finding connections changes context events, ends session. course, events subset possible changes context, since might several contexts within session. altered quality corpus several manners, test performances model confronted sparsity different types items. results show model robust constitutes promising approach.",4 word segmentation micro-blog texts external lexicon heterogeneous data. paper describes system designed nlpcc 2016 shared task word segmentation micro-blog texts.,4 learning deep structure-preserving image-text embeddings. paper proposes method learning joint embeddings images text using two-branch neural network multiple layers linear projections followed nonlinearities. network trained using large margin objective combines cross-view ranking constraints within-view neighborhood structure preservation constraints inspired metric learning literature. extensive experiments show approach gains significant improvements accuracy image-to-text text-to-image retrieval. method achieves new state-of-the-art results flickr30k mscoco image-sentence datasets shows promise new task phrase localization flickr30k entities dataset.,4 "using atl define advanced flexible constraint model transformations. transforming constraint models important task recent constraint programming systems. user-understandable models defined modeling phase rewriting tuning mandatory get solving-efficient models. 
propose new architecture allowing define bridges (modeling solver) languages implement model optimizations. architecture follows model-driven approach constraint modeling process seen set model transformations. among others, interesting feature definition transformations concept-oriented rules, i.e. based types model elements types organized hierarchy called metamodel.",4 "sentiment new york city: high resolution spatial temporal view. measuring public sentiment key task researchers policymakers alike. explosion available social media data allows time-sensitive geographically specific analysis ever before. paper analyze data micro-blogging site twitter generate sentiment map new york city. develop classifier specifically tuned 140-character twitter messages, tweets, using key words, phrases emoticons determine mood tweet. method, combined geotagging provided users, enables us gauge public sentiment extremely fine-grained spatial temporal scales. find public mood generally highest public parks lowest transportation hubs, locate areas strong sentiment cemeteries, medical centers, jail, sewage facility. sentiment progressively improves proximity times square. periodic patterns sentiment fluctuate daily weekly scale: positive tweets posted weekends weekdays, daily peak sentiment around midnight nadir 9:00 a.m. noon.",15 "image forgery localization based multi-scale convolutional neural networks. paper, propose utilize convolutional neural networks (cnns) segmentation-based multi-scale analysis locate tampered areas digital images. first, deal color input sliding windows different scales, unified cnn architecture designed. then, elaborately design training procedures cnns sampled training patches. set robust multi-scale tampering detectors based cnns, complementary tampering possibility maps generated. last least, segmentation-based method proposed fuse maps generate final decision map. 
exploiting benefits small-scale large-scale analyses, segmentation-based multi-scale analysis lead performance leap forgery localization cnns. numerous experiments conducted demonstrate effectiveness efficiency method.",4 "faster coordinate descent via adaptive importance sampling. coordinate descent methods employ random partial updates decision variables order solve huge-scale convex optimization problems. work, introduce new adaptive rules random selection updates. adaptive, mean selection rules based dual residual primal-dual gap estimates change iteration. theoretically characterize performance selection rules demonstrate improvements state-of-the-art, extend theory algorithms general convex objectives. numerical evidence hinge-loss support vector machines lasso confirm practice follows theory.",4 "semi-bounded rationality: model decision making. paper theory semi-bounded rationality proposed extension theory bounded rationality. particular, proposed decision making process involves two components correlation machine, estimates missing values, causal machine, relates cause effect. rational decision making involves using information almost always imperfect incomplete well intelligent machine human inconsistent make decisions. theory bounded rationality decision made irrespective fact information used incomplete imperfect human brain inconsistent thus decision made taken within bounds limitations. theory semi-bounded rationality, signal processing used filter noise outliers information correlation machine applied complete missing information artificial intelligence used make consistent decisions.",4 "deep architecture semantic parsing. many successful approaches semantic parsing build top syntactic analysis text, make use distributional representations statistical models match parses ontology-specific queries. paper presents novel deep learning architecture provides semantic parsing system union two neural models language semantics. 
allows generation ontology-specific queries natural language statements questions without need parsing, makes especially suitable grammatically malformed syntactically atypical text, tweets, well permitting development semantic parsers resource-poor languages.",4 "training convolutional neural network appearance-invariant place recognition. place recognition one challenging problems computer vision, become key part mobile robotics autonomous driving applications performing loop closure visual slam systems. moreover, difficulty recognizing revisited location increases appearance changes caused, instance, weather illumination variations, hinders long-term application algorithms real environments. paper present convolutional neural network (cnn), trained first time purpose recognizing revisited locations severe appearance changes, maps images low dimensional space euclidean distances represent place dissimilarity. order network learn desired invariances, train triplets images selected datasets present challenging variability visual appearance. triplets selected way two samples location third one taken different place. validate system extensive experimentation, demonstrate better performance state-of-the-art algorithms number popular datasets.",4 "development evaluation deep learning model protein-ligand binding affinity prediction. structure based ligand discovery one successful approaches augmenting drug discovery process. currently, notable shift towards machine learning (ml) methodologies aid procedures. deep learning recently gained considerable attention allows model ""learn"" extract features relevant task hand. developed novel deep neural network estimating binding affinity ligand-receptor complexes. complex represented 3d grid, model utilizes 3d convolution produce feature map representation, treating atoms proteins ligands manner. network tested casf ""scoring power"" benchmark astex diverse set outperformed classical scoring functions. 
model, together usage instructions examples, available git repository http://gitlab.com/cheminfibb/pafnucy",19 "deep learning conditional random fields-based depth estimation topographical reconstruction conventional endoscopy. colorectal cancer fourth leading cause cancer deaths worldwide second leading cause united states. risk colorectal cancer mitigated identification removal premalignant lesions optical colonoscopy. unfortunately, conventional colonoscopy misses 20% polyps removed, due part poor contrast lesion topography. imaging tissue topography colonoscopy difficult size constraints endoscope deforming mucosa. existing methods make geometric assumptions incorporate priori information, limits accuracy sensitivity. paper, present method avoids restrictions, using joint deep convolutional neural network-conditional random field (cnn-crf) framework. estimated depth used reconstruct topography surface colon single image. train unary pairwise potential functions crf cnn synthetic data, generated developing endoscope camera model rendering 100,000 images anatomically-realistic colon. validate approach real endoscopy images porcine colon, transferred synthetic-like domain, ground truth registered computed tomography measurements. cnn-crf approach estimates depths relative error 0.152 synthetic endoscopy images 0.242 real endoscopy images. show estimated depth maps used reconstructing topography mucosa conventional colonoscopy images. approach easily integrated existing endoscopy systems provides foundation improving computer-aided detection algorithms detection, segmentation classification lesions.",4 "deep multi-view spatial-temporal network taxi demand prediction. taxi demand prediction important building block enabling intelligent transportation systems smart city. accurate prediction model help city pre-allocate resources meet travel demand reduce empty taxis streets waste energy worsen traffic congestion. 
increasing popularity taxi requesting services uber didi chuxing (in china), able collect large-scale taxi demand data continuously. utilize big data improve demand prediction interesting critical real-world problem. traditional demand prediction methods mostly rely time series forecasting techniques, fail model complex non-linear spatial temporal relations. recent advances deep learning shown superior performance traditionally challenging tasks image classification learning complex features correlations large-scale data. breakthrough inspired researchers explore deep learning techniques traffic prediction problems. however, existing methods traffic prediction considered spatial relation (e.g., using cnn) temporal relation (e.g., using lstm) independently. propose deep multi-view spatial-temporal network (dmvst-net) framework model spatial temporal relations. specifically, proposed model consists three views: temporal view (modeling correlations future demand values near time points via lstm), spatial view (modeling local spatial correlation via local cnn), semantic view (modeling correlations among regions sharing similar temporal patterns). experiments large-scale real taxi demand data demonstrate effectiveness approach state-of-the-art methods.",4 "learning point count. paper proposes problem point-and-count test case break what-and-where deadlock. different traditional detection problem, goal discover key salient points way localize count number objects simultaneously. propose two alternatives, one counts first point, another works way around. fundamentally, pivot around whether solve ""what"" ""where"" first. evaluate performance dataset contains multiple instances class, demonstrating potentials synergies. experiences derive important insights explains much harder problem classification, including strong data bias inability deal object scales robustly state-of-art convolutional neural networks.",4 "automatic image de-fencing system. 
tourists wild-life photographers often hindered capturing cherished images videos fence limits accessibility scene interest. situation exacerbated growing concerns security public places need exists provide tool used post-processing fenced videos produce de-fenced image. several challenges problem, identify robust detection fence/occlusions estimating pixel motion background scenes filling fence/occlusions utilizing information multiple frames input video. work, aim build automatic post-processing tool efficiently rid input video occlusion artifacts like fences. work distinguished two major contributions. first introduction learning based technique detect fences patterns complicated backgrounds. second formulation objective function minimization loopy belief propagation fill-in fence pixels. observe grids histogram oriented gradients descriptor using support vector machines based classifier significantly outperforms detection accuracy texels lattice. present results experiments using several real-world videos demonstrate effectiveness proposed fence detection de-fencing algorithm.",4 "web-based question answering: decision-making perspective. describe investigation use probabilistic models cost-benefit analyses guide resource-intensive procedures used web-based question answering system. first provide overview research question-answering systems. then, present details askmsr, prototype web-based question answering system. discuss bayesian analyses quality answers generated system show endow system ability make decisions number queries issued search engine, given cost queries expected value query results refining ultimate answer. finally, review results set experiments.",4 "efficient marginal likelihood computation gaussian process regression. bayesian learning setting, posterior distribution predictive model arises trade-off prior distribution conditional likelihood observed data. 
distribution functions usually rely additional hyperparameters need tuned order achieve optimum predictive performance; operation efficiently performed empirical bayes fashion maximizing posterior marginal likelihood observed data. since score function optimization problem general characterized presence local optima, necessary resort global optimization strategies, require large number function evaluations. given evaluation usually computationally intensive badly scaled respect dataset size, maximum number observations treated simultaneously quite limited. paper, consider case hyperparameter tuning gaussian process regression. straightforward implementation posterior log-likelihood model requires o(n^3) operations every iteration optimization procedure, n number examples input dataset. derive novel set identities allow, initial overhead o(n^3), evaluation score function, well jacobian hessian matrices, o(n) operations. prove proposed identities, follow eigendecomposition kernel matrix, yield reduction several orders magnitude computation time hyperparameter optimization problem. notably, proposed solution provides computational advantages even respect state art approximations rely sparse kernel matrices.",19 "quantitative entropy study language complexity. study entropy chinese english texts, based characters case chinese texts based words languages. significant differences found languages different personal styles debating partners. entropy analysis points direction lower entropy, higher complexity. text analysis would applied individuals different styles, single individual different age, well different groups population.",4 "multi-objective design quantum circuits using genetic programming. quantum computing new way data processing based concept quantum mechanics. quantum circuit design process converting quantum gate series basic gates divided two general categories based decomposition composition. 
second group, using evolutionary algorithms especially genetic algorithms, multiplication matrix gates used achieve final characteristic quantum circuit. genetic programming subfield evolutionary computing computer programs evolve solve studied problems. past research done field quantum circuits design, one cost metrics (usually quantum cost) investigated. paper first time, multi-objective approach provided design quantum circuits using genetic programming considers depth cost nearest neighbor metrics addition quantum cost metric. another innovation article use two-step fitness function taking account equivalence global phase quantum gates. results show proposed method able find good answer short time.",4 "implementing test strategy advanced video acquisition processing architecture. paper presents aspects related test process advanced video system used remote ip surveillance. system based pentium compatible architecture using industrial standard pc104+. first overall architecture system presented, involving hardware software aspects. acquisition board developed special, nonstandard architecture, also briefly presented. main purpose research set coherent set procedures order test aspects video acquisition board. accomplish this, necessary set-up procedure two steps: stand alone video board test (functional test) in-system test procedure verifying compatibility os: linux windows. paper presents also results obtained using procedure.",4 "tensorizing generative adversarial nets. generative adversarial network (gan) variants demonstrate state-of-the-art performance class generative models. capture higher dimensional distributions, common learning procedure requires high computational complexity large number parameters. paper, present new generative adversarial framework representing layer tensor structure connected multilinear operations, aiming reduce number model parameters large factor preserving quality generalized performance. 
learn model, develop efficient algorithm alternating optimization mode connections. experimental results demonstrate model achieve high compression rate model parameters 40 times compared existing gan.",4 "adam: method stochastic optimization. introduce adam, algorithm first-order gradient-based optimization stochastic objective functions, based adaptive estimates lower-order moments. method straightforward implement, computationally efficient, little memory requirements, invariant diagonal rescaling gradients, well suited problems large terms data and/or parameters. method also appropriate non-stationary objectives problems noisy and/or sparse gradients. hyper-parameters intuitive interpretations typically require little tuning. connections related algorithms, adam inspired, discussed. also analyze theoretical convergence properties algorithm provide regret bound convergence rate comparable best known results online convex optimization framework. empirical results demonstrate adam works well practice compares favorably stochastic optimization methods. finally, discuss adamax, variant adam based infinity norm.",4 "contradiction detection rumorous claims. utilization social media material journalistic workflows increasing, demanding automated methods identification mis- disinformation. since textual contradiction across social media posts signal rumorousness, seek model claims twitter posts textually contradicted. identify two different contexts contradiction emerges: broader form observed across independently posted tweets specific form threaded conversations. define two scenarios differ terms central elements argumentation: claims conversation structure. design evaluate models two scenarios uniformly 3-way recognizing textual entailment tasks order represent claims conversation structure implicitly generic inference model, previous studies used explicit representation properties. 
address noisy text, classifiers use simple similarity features derived string part-of-speech level. corpus statistics reveal distribution differences features contradictory opposed non-contradictory tweet relations, classifiers yield state art performance.",4 "crossing dependencies really scarce?. syntactic structure sentence modelled tree, vertices correspond words edges indicate syntactic dependencies. claimed recurrently number edge crossings real sentences small. however, baseline null hypothesis lacking. quantify amount crossings real sentences compare predictions series baselines. conclude crossings really scarce real sentences. scarcity unexpected hubiness trees. indeed, real sentences close linear trees, potential number crossings maximized.",15 "heuristic algorithms obtaining polynomial threshold functions low densities. paper present several heuristic algorithms, including genetic algorithm (ga), obtaining polynomial threshold function (ptf) representations boolean functions (bfs) small number monomials. compare among algorithm oztop via computational experiments. results indicate heuristic algorithms find parsimonious representations compared non-heuristic ga-based algorithms.",4 "comparison multi-task convolutional neural network (mt-cnn) methods toxicity prediction. toxicity analysis prediction paramount importance human health environmental protection. existing computational methods built wide variety descriptors regressors, makes performance analysis difficult. example, deep neural network (dnn), successful approach many occasions, acts like black box offers little conceptual elegance physical understanding. present work constructs common set microscopic descriptors based established physical models charges, surface areas free energies assess performance multi-task convolutional neural network (mt-cnn) architectures approaches, including random forest (rf) gradient boosting decision tree (gbdt), equal footing. 
comparison also given convolutional neural network (cnn) non-convolutional deep neural network (dnn) algorithms. four benchmark toxicity data sets (i.e., endpoints) used evaluate various approaches. extensive numerical studies indicate present mt-cnn architecture able outperform state-of-the-art methods.",16 "coverless information hiding based generative model. new coverless image information hiding method based generative model proposed, feed secret image generative model database, generate meaning-normal independent image different secret image, then, generated image transmitted receiver fed generative model database generate another image visually secret image. need transmit meaning-normal image related secret image, achieve effect transmission secret image. first time propose coverless image information hiding method based generative model, compared traditional image steganography, transmitted image embed information secret image method, therefore, effectively resist steganalysis tools. experimental results show method high capacity, safety reliability.",4 "complexity normal form rewrite sequences associativity. complexity particular term-rewrite system considered: rule associativity (x*y)*z --> x*(y*z). algorithms exact calculations given longest shortest sequences applications --> result normal form (nf). shortest nf sequence term x always n-drm(x), n number occurrences * x drm(x) depth rightmost leaf x. longest nf sequence term length n(n-1)/2.",2 "multimodal recurrent neural networks information transfer layers indoor scene labeling. paper proposes new method called multimodal rnns rgb-d scene semantic segmentation. optimized classify image pixels given two input sources: rgb color channels depth maps. simultaneously performs training two recurrent neural networks (rnns) crossly connected information transfer layers, learnt adaptively extract relevant cross-modality features. 
rnn model learns representations previous hidden states transferred patterns rnns previous hidden states; thus, model-specific crossmodality features retained. exploit structure quad-directional 2d-rnns model short long range contextual information 2d input image. carefully designed various baselines efficiently examine proposed model structure. test multimodal rnns method popular rgb-d benchmarks show outperforms previous methods significantly achieves competitive results state-of-the-art works.",4 "computing web-scale topic models using asynchronous parameter server. topic models latent dirichlet allocation (lda) widely used information retrieval tasks ranging smoothing feedback methods tools exploratory search discovery. however, classical methods inferring topic models scale massive size today's publicly available web-scale data sets. state-of-the-art approaches rely custom strategies, implementations hardware facilitate asynchronous, communication-intensive workloads. present aps-lda, integrates state-of-the-art topic modeling cluster computing frameworks spark using novel asynchronous parameter server. advantages integration include convenient usage existing data processing pipelines eliminating need disk writes data kept memory start finish. goal outperform highly customized implementations, propose general high-performance topic modeling framework easily used today's data processing pipelines. compare aps-lda existing spark lda implementations show system can, 480-core cluster, process 135 times data 10 times topics without sacrificing model quality.",4 "evaluation output embeddings fine-grained image classification. image classification advanced significantly recent years availability large-scale image sets. however, fine-grained classification remains major challenge due annotation cost large numbers fine-grained categories. project shows compelling classification performance achieved categories even without labeled training data. 
given image class embeddings, learn compatibility function matching embeddings assigned higher score mismatching ones; zero-shot classification image proceeds finding label yielding highest joint compatibility score. use state-of-the-art image features focus different supervised attributes unsupervised output embeddings either derived hierarchies learned unlabeled text corpora. establish substantially improved state-of-the-art animals attributes caltech-ucsd birds datasets. encouragingly, demonstrate purely unsupervised output embeddings (learned wikipedia improved fine-grained text) achieve compelling results, even outperforming previous supervised state-of-the-art. combining different output embeddings, improve results.",4 "disentangling 3d pose dendritic cnn unconstrained 2d face alignment. heatmap regression used landmark localization quite now. methods use deep stack bottleneck modules heatmap classification stage, followed heatmap regression extract keypoints. paper, present single dendritic cnn, termed pose conditioned dendritic convolution neural network (pcd-cnn), classification network followed second modular classification network, trained end end fashion obtain accurate landmark points. following bayesian formulation, disentangle 3d pose face image explicitly conditioning landmark estimation pose, making different multi-tasking approaches. extensive experimentation shows conditioning pose reduces localization error making agnostic face pose. proposed model extended yield variable number landmark points hence broadening applicability datasets. instead increasing depth width network, train cnn efficiently mask-softmax loss hard sample mining achieve upto $15\%$ reduction error compared state-of-the-art methods extreme medium pose face images challenging datasets including aflw, afw, cofw ibug.",4 "langpro: natural language theorem prover. langpro automated theorem prover natural language (https://github.com/kovvalsky/langpro). 
given set premises hypothesis, able prove semantic relations them. prover based version analytic tableau method specially designed natural logic. proof procedure operates logical forms preserve linguistic expressions large extent. nature proofs deductive transparent. fracas sick textual entailment datasets, prover achieves high results comparable state-of-the-art.",4 "deterministic mdps adversarial rewards bandit feedback. consider markov decision process deterministic state transition dynamics, adversarially generated rewards change arbitrarily round round, bandit feedback model decision maker observes rewards receives. setting, present novel efficient online decision making algorithm named marcopolo. mild assumptions structure transition dynamics, prove marcopolo enjoys regret o(t^(3/4)sqrt(log(t))) best deterministic policy hindsight. specifically, analysis rely stringent unichain assumption, dominates much previous work topic.",4 "trace norm regularization faster inference embedded speech recognition rnns. propose evaluate new techniques compressing speeding dense matrix multiplications found fully connected recurrent layers neural networks embedded large vocabulary continuous speech recognition (lvcsr). compression, introduce study trace norm regularization technique training low rank factored versions matrix multiplications. compared standard low rank training, show method leads good accuracy versus number parameter trade-offs used speed training large models. speedup, enable faster inference arm processors new open sourced kernels optimized small batch sizes, resulting 3x 7x speed ups widely used gemmlowp library. 
beyond lvcsr, expect techniques kernels generally applicable embedded neural networks large fully connected recurrent layers.",4 "exploring speech enhancement generative adversarial networks robust speech recognition. investigate effectiveness generative adversarial networks (gans) speech enhancement, context improving noise robustness automatic speech recognition (asr) systems. prior work demonstrates gans effectively suppress additive noise raw waveform speech signals, improving perceptual quality metrics; however technique justified context asr. work, conduct detailed study measure effectiveness gans enhancing speech contaminated additive reverberant noise. motivated recent advances image processing, propose operating gans log-mel filterbank spectra instead waveforms, requires less computation robust reverberant noise. gan enhancement improves performance clean-trained asr system noisy speech, falls short performance achieved conventional multi-style training (mtr). appending gan-enhanced features noisy inputs retraining, achieve 7% wer improvement relative mtr system.",4 "brain eeg time series selection: novel graph-based approach classification. brain electroencephalography (eeg) classification widely applied analyze cerebral diseases recent years. unfortunately, invalid/noisy eegs degrade diagnosis performance previously developed methods ignore necessity eeg selection classification. end, paper proposes novel maximum weight clique-based eeg selection approach, named mwceegs, map eeg selection searching maximum similarity-weighted cliques improved fréchet distance-weighted undirected eeg graph simultaneously considering edge weights vertex weights. mwceegs improves classification performance selecting intra-clique pairwise similar inter-clique discriminative eegs similarity threshold δ. 
experimental results demonstrate algorithm effectiveness compared state-of-the-art time series selection algorithms real-world eeg datasets.",4 "sequential dual deep learning shape texture features sketch recognition. recognizing freehand sketches high arbitrariness greatly challenging. existing methods either ignore geometric characteristics treat sketches handwritten characters fixed structural ordering. consequently, hardly yield high recognition performance even though sophisticated learning techniques employed. paper, propose sequential deep learning strategy combines shape texture features. coded shape descriptor exploited characterize geometry sketch strokes high flexibility, outputs convolutional neural networks (cnn) taken abstract texture feature. develop dual deep networks memorable gated recurrent units (grus), sequentially feed two types features dual networks, respectively. dual networks enable feature fusion another gated recurrent unit (gru), thus accurately recognize sketches invariant stroke ordering. experiments tu-berlin data set show method outperforms average human state-of-the-art algorithms even significant shape appearance variations occur.",4 "prediction advice unknown number experts. framework prediction expert advice, consider recently introduced kind regret bounds: bounds depend effective instead nominal number experts. contrast normalhedge bound, mainly depends effective number experts also weakly depends nominal one, obtain bound contain nominal number experts all. use defensive forecasting method introduce application defensive forecasting multivalued supermartingales.",4 "hnp3: hierarchical nonparametric point process modeling content diffusion social media. paper introduces novel framework modeling temporal events complex longitudinal dependency generated dependent sources. framework takes advantage multidimensional point processes modeling time events. 
intensity function proposed process mixture intensities, complexity grows complexity temporal patterns data. moreover, utilizes hierarchical dependent nonparametric approach model marks events. capabilities allow proposed model adapt temporal topical complexity according complexity data, makes suitable candidate real world scenarios. online inference algorithm also proposed makes framework applicable vast range applications. framework applied real world application, modeling diffusion contents networks. extensive experiments reveal effectiveness proposed framework comparison state-of-the-art methods.",19 "nonlinear information bottleneck. information bottleneck [ib] technique extracting information `input' random variable relevant predicting different 'output' random variable. ib works encoding input compressed 'bottleneck variable' output accurately decoded. ib difficult compute practice, mainly developed two limited cases: (1) discrete random variables small state spaces, (2) continuous random variables jointly gaussian distributed (in case encoding decoding maps linear). propose method perform ib general domains. approach applied discrete continuous inputs outputs, allows nonlinear encoding decoding maps. method uses novel upper bound ib objective, derived using non-parametric estimator mutual information variational approximation. show implement method using neural networks gradient-based optimization, demonstrate performance mnist dataset.",4 "sk_p: neural program corrector moocs. present novel technique automatic program correction moocs, capable fixing syntactic semantic errors without manual, problem specific correction strategies. given incorrect student program, generates candidate programs distribution likely corrections, checks candidate correctness test suite. key observation moocs many programs share similar code fragments, seq2seq neural network model, used natural-language processing task machine translation, modified trained recover fragments. 
experiment shows scheme correct 29% incorrect submissions out-performs state art approach requires manual, problem specific correction strategies.",4 "towards optimal learning chain graphs. paper, extend meek's conjecture (meek 1997) directed acyclic graphs chain graphs, prove extended conjecture true. specifically, prove chain graph h independence map independence model induced another chain graph g, (i) g transformed h sequence directed undirected edge additions feasible splits mergings, (ii) operation sequence h remains independence map independence model induced g. result important consequence learning chain graphs data proof meek's conjecture (chickering 2002) learning bayesian networks data: makes possible develop efficient asymptotically correct learning algorithms mild assumptions.",19 "disjunctive logic programs inheritance. paper proposes new knowledge representation language, called dlp<, extends disjunctive logic programming (with strong negation) inheritance. addition inheritance enhances knowledge modeling features language providing natural representation default reasoning exceptions. declarative model-theoretic semantics dlp< provided, shown generalize answer set semantics disjunctive logic programs. knowledge modeling features language illustrated encoding classical nonmonotonic problems dlp<. complexity dlp< analyzed, proving inheritance cause computational overhead, reasoning dlp< exactly complexity reasoning disjunctive logic programming. confirmed existence efficient translation dlp< plain disjunctive logic programming. using translation, advanced kr system supporting dlp< language implemented top dlv system subsequently integrated dlv.",4 "recurrent neural network encoder attention community question answering. apply general recurrent neural network (rnn) encoder framework community question answering (cqa) tasks. approach rely linguistic processing, applied different languages domains. 
improvements observed extend rnn encoders neural attention mechanism encourages reasoning entire sequences. deal practical issues data sparsity imbalanced labels, apply various techniques transfer learning multitask learning. experiments semeval-2016 cqa task show 10% improvement map score compared information retrieval-based approach, achieve comparable performance strong handcrafted feature-based method.",4 "dna reservoir computing: novel molecular computing approach. propose novel molecular computing approach based reservoir computing. reservoir computing, dynamical core, called reservoir, perturbed external input signal readout layer maps reservoir dynamics target output. computation takes place transformation input space high-dimensional spatiotemporal feature space created transient dynamics reservoir. readout layer combines features produce target output. show coupled deoxyribozyme oscillators act reservoir. show despite using three coupled oscillators, molecular reservoir computer could achieve 90% accuracy benchmark temporal problem.",4 "differentiable transition additive multiplicative neurons. existing approaches combine additive multiplicative neural units either use fixed assignment operations require discrete optimization determine function neuron perform. however, leads extensive increase computational complexity training procedure. present novel, parameterizable transfer function based mathematical concept non-integer functional iteration allows operation neuron performs smoothly and, importantly, differentiablely adjusted addition multiplication. allows decision addition multiplication integrated standard backpropagation training procedure.",4 "foundations brussels operational-realistic approach cognition. scientific community becoming interested research applies mathematical formalism quantum theory model human decision-making. paper, provide theoretical foundations quantum approach cognition developed brussels. 
foundations rest results two decade studies axiomatic operational-realistic approaches foundations quantum physics. deep analogies foundations physics cognition lead us investigate validity quantum theory general unitary framework cognitive processes, empirical success hilbert space models derived investigation provides strong theoretical confirmation validity. however, two situations cognitive realm, 'question order effects' 'response replicability', indicate even hilbert space framework could insufficient reproduce collected data. mean mentioned operational-realistic approach would incorrect, simply larger class measurements would force human cognition, extended quantum formalism may needed deal them. explain, recently derived 'extended bloch representation' quantum theory (and associated 'general tension-reduction' model) precisely provides extended formalism, remaining within unitary interpretative framework.",4 "adjuncts processing lexical rules. standard hpsg analysis germanic verb clusters explain observed narrow-scope readings adjuncts verb clusters. present extension hpsg analysis accounts systematic ambiguity scope adjuncts verb cluster constructions, treating adjuncts members subcat list. extension uses powerful recursive lexical rules, implemented complex constraints. show `delayed evaluation' techniques constraint-logic programming used process lexical rules.",2 "using quaternion's representation individuals swarm intelligence evolutionary computation. paper introduces novel idea representation individuals using quaternions swarm intelligence evolutionary algorithms. quaternions number system, extends complex numbers. successfully applied problems theoretical physics areas needing fast rotation calculations. propose application quaternions optimization, precisely, using quaternions representation individuals bat algorithm. 
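One plausible reading of the quaternion representation described above (an assumption for illustration, not necessarily the paper's exact encoding): each real-valued solution component is stored as one quaternion and decoded via its Euclidean norm, with variation applied directly in quaternion space:

```python
import numpy as np

def init_individual(rng, dim):
    """One individual = `dim` quaternions, i.e. a (dim, 4) array."""
    return rng.normal(0.0, 1.0, size=(dim, 4))

def to_real(q):
    """Decode quaternions back to real-valued components via their norms."""
    return np.linalg.norm(q, axis=1)

def mutate(rng, q, sigma=0.1):
    """Variation operator: Gaussian perturbation of quaternion components."""
    return q + rng.normal(0.0, sigma, size=q.shape)

rng = np.random.default_rng(42)
ind = init_individual(rng, dim=10)
sphere = lambda x: float(np.sum(x ** 2))   # toy fitness on the decoded vector
fitness = sphere(to_real(ind))
child = mutate(rng, ind)
```

Note that decoding via the norm yields nonnegative components only; a real bat-algorithm variant would use a mapping suited to its search domain.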
preliminary results experiments optimizing test-suite consisting ten standard functions showed new algorithm significantly improved results original bat algorithm. moreover, obtained results comparable swarm intelligence evolutionary algorithms, like artificial bees colony, differential evolution. believe representation could also successfully applied swarm intelligence evolutionary algorithms.",4 "learning observe. process diagnosis involves learning state system various observations symptoms findings system. sophisticated bayesian (and other) algorithms developed revise maintain beliefs system observations made. nonetheless, diagnostic models tended ignore common sense reasoning exploited human diagnosticians; particular, one learn observations made, spirit conversational implicature. two concepts describe extract information observations made. first, symptoms, present, likely reported others. second, human diagnosticians expert systems economical data-gathering, searching first likely find symptoms present. thus, desirable bias toward reporting symptoms present. develop simple model concepts significantly improve diagnostic inference.",4 "skynet: efficient robust neural network training tool machine learning astronomy. present first public release generic neural network training algorithm, called skynet. efficient robust machine learning tool able train large deep feed-forward neural networks, including autoencoders, use wide range supervised unsupervised learning applications, regression, classification, density estimation, clustering dimensionality reduction. skynet uses `pre-training' method obtain set network parameters empirically shown close good solution, followed optimisation using regularised variant newton's method, level regularisation determined adjusted automatically; latter uses second-order derivative information improve convergence, without need evaluate store full hessian matrix, using fast approximate method calculate hessian-vector products. 
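Skynet's use of second-order information rests on Hessian-vector products computed without forming the Hessian. A standard way to approximate H v is a finite difference of gradients (a generic sketch, not Skynet's exact method):

```python
import numpy as np

def hessian_vector_product(grad_fn, w, v, eps=1e-5):
    """Approximate H v by a central finite difference of the gradient,
    so the full Hessian is never formed or stored."""
    return (grad_fn(w + eps * v) - grad_fn(w - eps * v)) / (2.0 * eps)

# Sanity check on a quadratic f(w) = 0.5 * w.T @ A @ w, whose Hessian is exactly A.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
grad = lambda w: A @ w
w0 = np.array([0.5, -1.0])
v = np.array([1.0, 2.0])
hv = hessian_vector_product(grad, w0, v)
```

For a quadratic objective the finite difference is exact up to rounding, so `hv` matches `A @ v`.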
combination methods allows training complicated networks difficult optimise using standard backpropagation techniques. skynet employs convergence criteria naturally prevent overfitting, also includes fast algorithm estimating accuracy network outputs. utility flexibility skynet demonstrated application number toy problems, astronomical problems focusing recovery structure blurred noisy images, identification gamma-ray bursters, compression denoising galaxy images. skynet software, implemented standard ansi c fully parallelised using mpi, available http://www.mrao.cam.ac.uk/software/skynet/.",1 "item2vec: neural item embedding collaborative filtering. many collaborative filtering (cf) algorithms item-based sense analyze item-item relations order produce item similarities. recently, several works field natural language processing (nlp) suggested learn latent representation words using neural embedding algorithms. among them, skip-gram negative sampling (sgns), also known word2vec, shown provide state-of-the-art results various linguistics tasks. paper, show item-based cf cast framework neural word embedding. inspired sgns, describe method name item2vec item-based cf produces embedding items latent space. method capable inferring item-item relations even user information available. present experimental results demonstrate effectiveness item2vec method show competitive svd.",4 "assigning satisfaction values constraints: algorithm solve dynamic meta-constraints. model dynamic meta-constraints special activity constraints activate constraints. also meta-constraints range constraints. algorithm presented constraints assigned one five different satisfaction values, leads assignment domain values variables csp. outline model algorithm presented, followed initial results two problems: simple classic csp car configuration problem. 
algorithm shown perform backtracks per solution, overheads form historical records required implementation state.",4 alternative restart strategies cma-es. paper focuses restart strategy cma-es multi-modal functions. first alternative strategy proceeds decreasing initial step-size mutation doubling population size restart. second strategy adaptively allocates computational budget among restart settings bipop scheme. restart strategies validated bbob benchmark; generality also demonstrated independent real-world problem suite related spacecraft trajectory optimization.,4 "learning phrase representations using rnn encoder-decoder statistical machine translation. paper, propose novel neural network model called rnn encoder-decoder consists two recurrent neural networks (rnn). one rnn encodes sequence symbols fixed-length vector representation, decodes representation another sequence symbols. encoder decoder proposed model jointly trained maximize conditional probability target sequence given source sequence. performance statistical machine translation system empirically found improve using conditional probabilities phrase pairs computed rnn encoder-decoder additional feature existing log-linear model. qualitatively, show proposed model learns semantically syntactically meaningful representation linguistic phrases.",4 "deepmind control suite. deepmind control suite set continuous control tasks standardised structure interpretable rewards, intended serve performance benchmarks reinforcement learning agents. tasks written python powered mujoco physics engine, making easy use modify. include benchmarks several learning algorithms. control suite publicly available https://www.github.com/deepmind/dm_control . video summary tasks available http://youtu.be/raai4qzcybs .",4 "abstract machine typed feature structures. paper describes first step towards definition abstract machine linguistic formalisms based typed feature structures, hpsg. 
core design abstract machine given detail, including compilation process high-level specification language abstract machine language implementation abstract instructions. thus apply methods proved useful computer science study natural languages: grammar specified using formalism endowed operational semantics. currently, machine supports unification simple feature structures, unification sequences structures, cyclic structures disjunction.",2 "study clear sky models singapore. estimation total solar irradiance falling earth's surface important field solar energy generation forecasting. several clear-sky solar radiation models developed last decades. models based empirical distribution various geographical parameters; models consider various atmospheric effects solar energy estimation. paper, perform comparative analysis several popular clear-sky models, tropical region singapore. important countries like singapore, primarily focused reliable efficient solar energy generation. analyze compare three popular clear-sky models widely used literature. validate solar estimation results using actual solar irradiance measurements obtained collocated weather stations. finally conclude reliable clear sky model singapore, based clear sky days year.",15 preference via entrenchment. introduce simple generalization gardenfors makinson's epistemic entrenchment called partial entrenchment. show preferential inference generated sceptical counterpart inference mechanism defined directly partial entrenchment.,4 "memory-based control recurrent neural networks. partially observed control problems challenging aspect reinforcement learning. extend two related, model-free algorithms continuous control -- deterministic policy gradient stochastic value gradient -- solve partially observed domains using recurrent neural networks trained backpropagation time. 
demonstrate approach, coupled long-short term memory able solve variety physical control problems exhibiting assortment memory requirements. include short-term integration information noisy sensors identification system parameters, well long-term memory problems require preserving information many time steps. also demonstrate success combined exploration memory problem form simplified version well-known morris water maze task. finally, show approach deal high-dimensional observations learning directly pixels. find recurrent deterministic stochastic policies able learn similarly good solutions tasks, including water maze agent must learn effective search strategies.",4 "linking search space structure, run-time dynamics, problem difficulty: step toward demystifying tabu search. tabu search one effective heuristics locating high-quality solutions diverse array np-hard combinatorial optimization problems. despite widespread success tabu search, researchers poor understanding many key theoretical aspects algorithm, including models high-level run-time dynamics identification search space features influence problem difficulty. consider questions context job-shop scheduling problem (jsp), domain tabu search algorithms shown remarkably effective. previously, demonstrated mean distance random local optima nearest optimal solution highly correlated problem difficulty well-known tabu search algorithm jsp introduced taillard. paper, discuss various shortcomings measure develop new model problem difficulty corrects deficiencies. show taillards algorithm modeled high fidelity simple variant straightforward random walk. random walk model accounts nearly variability cost required locate optimal sub-optimal solutions random jsps, provides explanation differences difficulty random versus structured jsps. finally, discuss empirically substantiate two novel predictions regarding tabu search algorithm behavior. 
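The tabu search heuristic discussed above can be sketched minimally on a toy maximization problem (OneMax); the single-bit move structure, tenure, and aspiration rule here are illustrative, not Taillard's JSP-specific design:

```python
import random

def tabu_search_onemax(n=20, tenure=3, iters=300, seed=7):
    """Minimal tabu search maximizing OneMax. A move flips one bit; a flipped
    bit stays tabu for `tenure` iterations unless flipping it would beat the
    best solution found so far (aspiration criterion)."""
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n)]
    best_f = sum(x)
    tabu_until = [0] * n
    for it in range(1, iters + 1):
        move, move_f = None, -1
        for i in range(n):
            f = sum(x) + (1 if x[i] == 0 else -1)  # fitness after flipping bit i
            if (it >= tabu_until[i] or f > best_f) and f > move_f:
                move, move_f = i, f
        x[move] ^= 1                  # take the best admissible move
        tabu_until[move] = it + tenure
        best_f = max(best_f, move_f)
    return best_f

result = tabu_search_onemax()
```

Because at most `tenure` bits are tabu at any time, an admissible move always exists; on this toy landscape the search climbs straight to the optimum.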
first, method constructing initial solution highly unlikely impact performance tabu search. second, tabu tenure selected small possible simultaneously avoiding search stagnation; values larger necessary lead significant degradations performance.",4 "large kernel matters -- improve semantic segmentation global convolutional network. one recent trends [30, 31, 14] network architecture design stacking small filters (e.g., 1x1 3x3) entire network stacked small filters efficient large kernel, given computational complexity. however, field semantic segmentation, need perform dense per-pixel prediction, find large kernel (and effective receptive field) plays important role perform classification localization tasks simultaneously. following design principle, propose global convolutional network address classification localization issues semantic segmentation. also suggest residual-based boundary refinement refine object boundaries. approach achieves state-of-the-art performance two public benchmarks significantly outperforms previous results, 82.2% (vs 80.2%) pascal voc 2012 dataset 76.9% (vs 71.8%) cityscapes dataset.",4 "unsupervised feature learning audio analysis. identifying acoustic events continuously streaming audio source interest many applications including environmental monitoring basic research. scenario neither different event classes known distinguishes one class another. therefore, unsupervised feature learning method exploration audio data presented paper. incorporates two following novel contributions: first, audio frame predictor based convolutional lstm autoencoder demonstrated, used unsupervised feature extraction. second, training method autoencoders presented, leads distinct features amplifying event similarities.
comparison standard approaches, features extracted audio frame predictor trained novel approach show 13 % better results used classifier 36 % better results used clustering.",4 "learning structural weight uncertainty sequential decision-making. learning probability distributions weights neural networks (nns) recently proven beneficial many applications. bayesian methods, stein variational gradient descent (svgd), offer elegant framework reason nn model uncertainty. however, assuming independent gaussian priors individual nn weights (as often applied), svgd impose prior knowledge often structural information (dependence) among weights. propose efficient posterior learning structural weight uncertainty, within svgd framework, employing matrix variate gaussian priors nn parameters. investigate learned structural uncertainty sequential decision-making problems, including contextual bandits reinforcement learning. experiments several synthetic real datasets indicate superiority model, compared state-of-the-art methods.",19 "cnn-based spatial feature fusion algorithm hyperspectral imagery classification. shortage training samples remains one main obstacles applying artificial neural networks (ann) hyperspectral images classification. fuse spatial spectral information, pixel patches often utilized train model, may aggregate problem. existing works, ann model supervised center-loss (annc) introduced. training merely spectral information, annc yields discriminative spectral features suitable subsequent classification tasks. paper, cnn-based spatial feature fusion (csff) algorithm proposed, allows smart fusion spatial information spectral features extracted annc. critical part csff, cnn-based discriminant model introduced estimate whether two paring pixels belong class. testing stage, applying discriminant model pixel-pairs generated test pixel neighbors, local structure estimated represented customized convolutional kernel. 
spectral-spatial feature obtained convolutional operation estimated kernel corresponding spectral features within neighborhood. last, label test pixel predicted classifying resulting spectral-spatial feature. without increasing number training samples involving pixel patches training stage, csff framework achieves state-of-the-art declining $20\%-50\%$ classification failures experiments three well-known hyperspectral images.",4 "galileo: generalized low-entropy mixture model. present new method generating mixture models data categorical attributes. keys approach entropy-based density metric categorical space annealing high-entropy/low-density components initial state many components. pruning low-density components using entropy-based density allows galileo consistently find high-quality clusters optimal number clusters. galileo shown promising results range test datasets commonly used categorical clustering benchmarks. demonstrate scaling galileo linear number records dataset, making method suitable large categorical datasets.",19 "reconstructive sparse code transfer contour detection semantic labeling. frame task predicting semantic labeling sparse reconstruction procedure applies target-specific learned transfer function generic deep sparse code representation image. strategy partitions training two distinct stages. first, unsupervised manner, learn set generic dictionaries optimized sparse coding image patches. train multilayer representation via recursive sparse dictionary learning pooled codes output earlier layers. second, encode training images generic dictionaries learn transfer function optimizes reconstruction patches extracted annotated ground-truth given sparse codes corresponding image patches. test time, encode novel image using generic dictionaries reconstruct using transfer function. output reconstruction semantic labeling test image. applying strategy task contour detection, demonstrate performance competitive state-of-the-art systems. 
unlike almost prior work, approach obviates need form hand-designed features filters. illustrate general applicability, also show initial results semantic part labeling human faces. effectiveness approach opens new avenues research deep sparse representations. classifiers utilize representation novel manner. rather acting nodes deepest layer, attach nodes along slice multiple layers network order make predictions local patches. flexible combination generatively learned sparse representation discriminatively trained transfer classifiers extends notion sparse reconstruction encompass arbitrary semantic labeling tasks.",4 "unveiling link logical fallacies web persuasion. last decade human-computer interaction (hci) started focus attention forms persuasive interaction computer technologies goal changing users behavior attitudes according predefined direction. work, hypothesize strong connection logical fallacies (forms reasoning logically invalid cognitively effective) common persuasion strategies adopted within web technologies. aim empirically evaluating hypothesis, carried pilot study sample 150 e-commerce websites.",4 "contextual position-aware factorization machines sentiment classification. existing machine learning models achieved great success sentiment classification, typically explicitly capture sentiment-oriented word interaction, lead poor results fine-grained analysis snippet level (a phrase sentence). factorization machine provides possible approach learning element-wise interaction recommender systems, directly applicable task due inability model contexts word sequences. work, develop two position-aware factorization machines consider word interaction, context position information. information jointly encoded set sentiment-oriented word interaction vectors. compared traditional word embeddings, swi vectors explicitly capture sentiment-oriented word interaction simplify parameter learning. 
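The pairwise interaction model that factorization machines contribute here can be evaluated in O(kn) time via a well-known algebraic identity; a small numpy sketch of generic FM scoring (not the paper's position-aware extension):

```python
import numpy as np

def fm_predict(x, w0, w, V):
    """Second-order factorization machine:
        y = w0 + sum_i w_i x_i + sum_{i<j} <V_i, V_j> x_i x_j
    computed with the O(kn) identity
        sum_{i<j} <V_i,V_j> x_i x_j
          = 0.5 * sum_f [(sum_i V_if x_i)^2 - sum_i V_if^2 x_i^2]."""
    s = V.T @ x                                    # (k,) per-factor sums
    pairwise = 0.5 * float(np.sum(s ** 2) - np.sum((V ** 2).T @ (x ** 2)))
    return float(w0 + w @ x + pairwise)

rng = np.random.default_rng(3)
n, k = 6, 4                                        # illustrative sizes
x = rng.normal(size=n)
w0, w, V = 0.5, rng.normal(size=n), rng.normal(size=(n, k))
y = fm_predict(x, w0, w, V)

# Brute-force check of the pairwise identity
y_brute = w0 + w @ x + sum(V[i] @ V[j] * x[i] * x[j]
                           for i in range(n) for j in range(i + 1, n))
```

The identity is what makes FM-style interaction terms cheap enough to attach sentiment-oriented word interaction vectors to every feature.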
experimental results show comparable performance state-of-the-art methods document-level classification, benefit snippet/sentence-level sentiment analysis.",4 "mathematical programming strategies solving minimum common string partition problem. minimum common string partition problem np-hard combinatorial optimization problem applications computational biology. work propose first integer linear programming model solving problem. moreover, basis integer linear programming model develop deterministic 2-phase heuristic applicable larger problem instances. results show provenly optimal solutions obtained problem instances small medium size literature solving proposed integer linear programming model cplex. furthermore, new best-known solutions obtained considered problem instances literature. concerning heuristic, able show outperforms heuristic competitors related literature.",4 "2d geometric information really tell us 3d face shape?. face image contains geometric cues form configurational information contours used estimate 3d face shape. clear 3d reconstruction 2d points highly ambiguous constraints enforced, one might expect face-space constraint solves problem. show case geometric information ambiguous cue. two sources ambiguity. first that, within space 3d face shapes, flexibility modes remain parts face fixed. second occurs perspective projection result perspective transformation camera distance varies. two different faces, viewed different distances, give rise 2d geometry. demonstrate ambiguities, develop new algorithms fitting 3d morphable model 2d landmarks contours either orthographic perspective projection show compute flexibility modes cases. show fitting problems posed separable nonlinear least squares problem solved efficiently. provide quantitative qualitative evidence ambiguity exists synthetic data real images.",4 "robust subspace clustering via thresholding. 
problem clustering noisy incompletely observed high-dimensional data points union low-dimensional subspaces set outliers considered. number subspaces, dimensions, orientations assumed unknown. propose simple low-complexity subspace clustering algorithm, applies spectral clustering adjacency matrix obtained thresholding correlations data points. words, adjacency matrix constructed nearest neighbors data point spherical distance. statistical performance analysis shows algorithm exhibits robustness additive noise succeeds even subspaces intersect. specifically, results reveal explicit tradeoff affinity subspaces tolerable noise level. furthermore prove algorithm succeeds even data points incompletely observed number missing entries allowed (up log-factor) linear ambient dimension. also propose simple scheme provably detects outliers, present numerical results real synthetic data.",19 "artificial neoteny evolutionary image segmentation. neoteny, also spelled paedomorphosis, defined biological terms retention organism juvenile even larval traits later life. species, morphological development retarded; organism juvenilized sexually mature. shifts reproductive capability would appear adaptive significance organisms exhibit it. terms evolutionary theory, process paedomorphosis suggests larval stages developmental phases existing organisms may give rise, certain circumstances, wholly new organisms. although present work pretend model simulate biological details concept way, ideas incorporated rather simple abstract computational strategy, order allow (if possible) faster convergence simple non-memetic genetic algorithms, i.e. without using local improvement procedures (e.g. via baldwin lamarckian learning). case-study, genetic algorithm used colour image segmentation purposes using k-mean unsupervised clustering methods, namely guiding evolutionary algorithm search finding optimal sub-optimal data partition. 
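The thresholding step of the subspace clustering algorithm described above (build the adjacency from each point's most correlated neighbours in spherical distance, then symmetrize it for spectral clustering) can be sketched as follows; parameter names and the neighbour count are illustrative:

```python
import numpy as np

def threshold_adjacency(X, q=3):
    """Adjacency by thresholding correlations: for each data point (a column
    of X) keep its q nearest neighbours in spherical distance, i.e. the q
    largest |inner products| after normalizing columns, then symmetrize."""
    Xn = X / np.linalg.norm(X, axis=0, keepdims=True)
    C = np.abs(Xn.T @ Xn)          # pairwise |correlations|
    np.fill_diagonal(C, 0.0)       # a point is not its own neighbour
    A = np.zeros_like(C)
    for j in range(C.shape[1]):
        nn = np.argsort(C[:, j])[-q:]   # q most correlated points
        A[nn, j] = C[nn, j]
    return np.maximum(A, A.T)      # symmetric adjacency for spectral clustering
```

Spectral clustering would then be run on the returned adjacency matrix; that step is omitted to keep the sketch short.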
average results suggest use neotenic strategies employing juvenile genotypes later generations use linear-dynamic mutation rates instead constant, increase fitness values 58% comparing classical genetic algorithms, independently starting population characteristics search space. keywords: genetic algorithms, artificial neoteny, dynamic mutation rates, faster convergence, colour image segmentation, classification, clustering.",4 "automated surgical skill assessment rmis training. purpose: manual feedback basic rmis training consume significant amount time expert surgeons' schedule prone subjectivity. vr-based training tasks generate automated score reports, mechanism generating automated feedback surgeons performing basic surgical tasks rmis training. paper, explore usage different holistic features automated skill assessment using robot kinematic data propose weighted feature fusion technique improving score prediction performance. methods: perform experiments publicly available jigsaws dataset evaluate four different types holistic features robot kinematic data - sequential motion texture (smt), discrete fourier transform (dft), discrete cosine transform (dct) approximate entropy (apen). features used skill classification exact skill score prediction. along using features individually, also evaluate performance using proposed weighted combination technique. results: results demonstrate holistic features outperform previous hmm-based state-of-the-art methods skill classification jigsaws dataset. also, proposed feature fusion strategy significantly improves performance skill score predictions achieving 0.61 average spearman correlation coefficient. conclusions: holistic features capturing global information robot kinematic data successfully used evaluating surgeon skill basic surgical tasks da vinci robot.
using framework presented potentially allow real time score feedback rmis training.",4 "correlation game unsupervised learning yields computational interpretations hebbian excitation, anti-hebbian inhibition, synapse elimination. much learned plasticity biological synapses empirical studies. hebbian plasticity driven correlated activity presynaptic postsynaptic neurons. synapses converge onto neuron often behave compete fixed resource; survive competition others eliminated. provide computational interpretations aspects synaptic plasticity, formulate unsupervised learning zero-sum game hebbian excitation anti-hebbian inhibition neural network model. game formalizes intuition hebbian excitation tries maximize correlations neurons inputs, anti-hebbian inhibition tries decorrelate neurons other. include model synaptic competition, enables neuron eliminate connections except strongly correlated inputs. empirical studies, show facilitates learning sensory features resemble parts objects.",4 "multi-domain collaborative filtering. collaborative filtering effective recommendation approach preference user item predicted based preferences users similar interests. big challenge using collaborative filtering methods data sparsity problem often arises user typically rates items hence rating matrix extremely sparse. paper, address problem considering multiple collaborative filtering tasks different domains simultaneously exploiting relationships domains. refer multi-domain collaborative filtering (mcf) problem. solve mcf problem, propose probabilistic framework uses probabilistic matrix factorization model rating problem domain allows knowledge adaptively transferred across different domains automatically learning correlation domains. also introduce link function different domains correct biases. experiments conducted several real-world applications demonstrate effectiveness methods compared representative methods.",4 "(1+$λ$) evolutionary algorithm self-adjusting mutation rate. 
propose new way self-adjust mutation rate population-based evolutionary algorithms discrete search spaces. roughly speaking, it consists of creating half the offspring with twice the current mutation rate and half with half the current rate. mutation rate updated rate used subpopulation contains best offspring. analyze $(1+\lambda)$ evolutionary algorithm self-adjusting mutation rate optimizes onemax test function. prove dynamic version $(1+\lambda)$ ea finds optimum expected optimization time (number fitness evaluations) $O(n\lambda/\log\lambda+n\log n)$. time asymptotically smaller optimization time classic $(1+\lambda)$ ea. previous work shows performance best-possible among $\lambda$-parallel mutation-based unbiased black-box algorithms. result shows new way adjusting mutation rate find optimal dynamic parameter values fly. since adjustment mechanism simpler ones previously used adjusting mutation rate parameters itself, optimistic find applications.",4 "characterizing concept drift. machine learning models static, world dynamic, increasing online deployment learned models gives increasing urgency development efficient effective mechanisms address learning context non-stationary distributions, commonly called concept drift. however, key issue characterizing different types drift occur previously subjected rigorous definition analysis. particular, qualitative drift categorizations proposed, formally defined, quantitative descriptions required precise objective understanding learner performance existed. present first comprehensive framework quantitative analysis drift. supports development first comprehensive set formal definitions types concept drift. formal definitions clarify ambiguities identify gaps previous definitions, giving rise new comprehensive taxonomy concept drift types solid foundation research mechanisms detect address concept drift.",4 "one deep music representation rule all? : comparative analysis different representation learning strategies.
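The self-adjusting (1+λ) EA analyzed above admits a compact sketch on OneMax; the caps on the rate r, the population size, and the seed are illustrative choices, not values from the paper:

```python
import random

def self_adjusting_one_plus_lambda(n=50, lam=10, seed=5):
    """Sketch of the self-adjusting (1+lambda) EA on OneMax: half the offspring
    mutate at rate 2r/n, half at rate r/(2n); r is then replaced by the rate
    that produced the best offspring (kept in an illustrative range)."""
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n)]
    r, evals = 2.0, 0
    while sum(x) < n:
        best, best_f, best_r = None, -1, r
        for i in range(lam):
            ri = 2.0 * r if i < lam // 2 else r / 2.0
            p = min(0.5, ri / n)                       # per-bit flip probability
            y = [b ^ (rng.random() < p) for b in x]
            evals += 1
            if sum(y) > best_f:
                best, best_f, best_r = y, sum(y), ri
        if best_f >= sum(x):                           # elitist acceptance
            x = best
        r = min(max(best_r, 0.5), n / 4.0)             # self-adjustment of the rate
    return x, evals

solution, evaluations = self_adjusting_one_plus_lambda()
```

Near the optimum the low-rate half almost always produces the best offspring, so r decays toward its floor on its own, which is exactly the behaviour the analysis exploits.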
inspired success deploying deep learning fields computer vision natural language processing, learning paradigm also found way field music information retrieval. order benefit deep learning effective, also efficient manner, deep transfer learning become common approach. approach, possible reuse output pre-trained neural network basis new, yet unseen learning task. underlying hypothesis initial new learning tasks show commonalities applied type data (e.g. music audio), generated deep representation data also informative new task. since, however, networks used generate deep representations trained using single initial learning task, validity hypothesis questionable arbitrary new learning task. paper present results investigation best ways generate deep representations data learning tasks music domain. conducted investigation via extensive empirical study involves multiple learning tasks, well multiple deep learning architectures varying levels information sharing tasks, order learn music representations. validate representations considering multiple unseen learning tasks evaluation. results experiments yield several insights approach design methods learning widely deployable deep data representations music domain.",4 "stage 4 validation satellite image automatic mapper lightweight computer program earth observation level 2 product generation, part 2 validation. european space agency (esa) defines earth observation (eo) level 2 product multispectral (ms) image corrected geometric, atmospheric, adjacency topographic effects, stacked scene classification map (scm) whose legend includes quality layers cloud cloud-shadow. esa eo level 2 product ever systematically generated ground segment. contribute toward filling information gap eo big sensory data esa eo level 2 product, stage 4 validation (val) shelf satellite image automatic mapper (siam) lightweight computer program prior knowledge based ms color naming conducted independent means. 
time-series annual web enabled landsat data (weld) image composites conterminous u.s. (conus) selected input dataset. annual siam weld maps conus validated comparison u.s. national land cover data (nlcd) 2006 map. test reference maps share spatial resolution spatial extent, map legends must harmonized. sake readability paper split two. previous part 1 theory provided multidisciplinary background priori color naming. present part 2 validation presents discusses stage 4 val results collected test siam weld map time series reference nlcd map original protocol wall wall thematic map quality assessment without sampling, test reference map legends differ agreement part 1. conclusions siam-weld maps instantiate level 2 scm product whose legend fao land cover classification system (lccs) taxonomy dichotomous phase (dp) level 1 vegetation/nonvegetation, level 2 terrestrial/aquatic superior lccs level.",4 "theory deep learning iib: optimization properties sgd. theory iib characterize mix theory experiments optimization deep convolutional networks stochastic gradient descent. main new result paper theoretical experimental evidence following conjecture sgd: sgd concentrates probability -- like classical langevin equation -- large volume, ""flat"" minima, selecting flat minimizers high probability also global minimizers",4 "niching archive-based gaussian estimation distribution algorithm via adaptive clustering. model-based evolutionary algorithm, estimation distribution algorithm (eda) possesses unique characteristics widely applied global optimization. however, traditional gaussian eda (geda) may suffer premature convergence high risk falling local optimum dealing multimodal problem. paper, first attempts improve performance geda utilizing historical solutions develops novel archive-based eda variant. use historical solutions enhances search efficiency eda large extent, also significantly reduces population size faster convergence could achieved. 
then, the archive-based eda is integrated with a novel adaptive clustering strategy for solving multimodal optimization problems. taking advantage of the clustering strategy in locating different promising areas and of the powerful exploitation ability of the archive-based eda, the resultant algorithm is endowed with a strong capability of finding multiple optima. to verify the efficiency of the proposed algorithm, we tested it on a set of well-known niching benchmark problems and compared it with several state-of-the-art niching algorithms. the experimental results indicate that the proposed algorithm is competitive.",4 "chemical reaction optimization for the set covering problem. the set covering problem (scp) is one of the representative combinatorial optimization problems, with many practical applications. this paper investigates the development of an algorithm to solve scp by employing chemical reaction optimization (cro), a general-purpose metaheuristic. it is tested on a wide range of benchmark instances of scp. the simulation results indicate that the algorithm gives outstanding performance compared with other heuristics and metaheuristics in solving scp.",4 "sinkhorn distances: lightspeed computation of optimal transportation distances. optimal transportation distances are a fundamental family of parameterized distances for histograms. despite their appealing theoretical properties, excellent performance in retrieval tasks and intuitive formulation, their computation involves the resolution of a linear program whose cost is prohibitive whenever the histograms' dimension exceeds a few hundreds. we propose in this work a new family of optimal transportation distances that look at transportation problems from a maximum-entropy perspective. we smooth the classical optimal transportation problem with an entropic regularization term, and show that the resulting optimum is also a distance which can be computed through sinkhorn-knopp's matrix scaling algorithm at a speed that is several orders of magnitude faster than that of transportation solvers. we also report improved performance over classical optimal transportation distances on the mnist benchmark problem.",19 "bayesian deep convolutional encoder-decoder networks for surrogate modeling and uncertainty quantification.
we are interested in the development of surrogate models for uncertainty quantification and propagation in problems governed by stochastic pdes, using a deep convolutional encoder-decoder network in a similar fashion to approaches considered in deep learning for image-to-image regression tasks. since normal neural networks are data intensive and cannot provide predictive uncertainty, we propose a bayesian approach to convolutional neural nets. a recently introduced variational gradient descent algorithm based on stein's method is scaled to deep convolutional networks to perform approximate bayesian inference on millions of uncertain network parameters. this approach achieves state of the art performance in terms of predictive accuracy and uncertainty quantification in comparison to other approaches in bayesian neural networks as well as techniques that include gaussian processes and ensemble methods, even when the training data size is relatively small. to evaluate the performance of this approach, we consider standard uncertainty quantification benchmark problems including flow in heterogeneous media defined in terms of limited data-driven permeability realizations. the performance of the developed surrogate model is very good even though there is no underlying structure shared between the input (permeability) and output (flow/pressure) fields, as is often the case in the image-to-image regression models used in computer vision problems. studies are performed with an underlying stochastic input dimensionality of $4,225$ where most other uncertainty quantification methods fail. for the uncertainty propagation tasks considered, the predictive output bayesian statistics are compared to those obtained with monte carlo estimates.",15 "a bayesian approach to discovering truth from conflicting sources for data integration. in practical data integration systems, it is common for the data sources being integrated to provide conflicting information about the same entity. consequently, a major challenge for data integration is to derive the most complete and accurate integrated records from diverse and sometimes conflicting sources. we term this challenge the truth finding problem. we observe that some sources are generally more reliable than others, and therefore a good model of source quality is the key to solving the truth finding problem.
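As a baseline intuition for how source quality and truth estimates can be coupled (a simplified iterative weighted-vote stand-in, not the paper's Bayesian generative model), consider:

```python
from collections import defaultdict

def truth_finding(claims, n_iter=10):
    """claims: list of (source, entity, value) triples.
    Alternately estimate true values by reliability-weighted voting and
    source reliability by agreement with the current truth estimates."""
    sources = {s for s, _, _ in claims}
    weight = {s: 1.0 for s in sources}
    truth = {}
    for _ in range(n_iter):
        # Weighted vote per entity.
        votes = defaultdict(lambda: defaultdict(float))
        for s, e, v in claims:
            votes[e][v] += weight[s]
        truth = {e: max(vs, key=vs.get) for e, vs in votes.items()}
        # Re-estimate each source's weight as its (smoothed) agreement rate.
        correct, total = defaultdict(int), defaultdict(int)
        for s, e, v in claims:
            total[s] += 1
            correct[s] += (truth[e] == v)
        weight = {s: (correct[s] + 1) / (total[s] + 2) for s in sources}
    return truth, weight
```

Sources that persistently disagree with the consensus lose weight, so their claims count less in later rounds; the paper's model additionally distinguishes false-positive from false-negative error rates.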
in this work, we propose a probabilistic graphical model which can automatically infer true records and source quality without any supervision. in contrast to previous methods, our principled approach leverages a generative process of two types of errors (false positive and false negative) by modeling two different aspects of source quality. in so doing, ours is also the first approach designed to merge multi-valued attribute types. our method is scalable, due to an efficient sampling-based inference algorithm that needs very few iterations in practice and enjoys linear time complexity, with an even faster incremental variant. experiments on two real world datasets show that our new method outperforms existing state-of-the-art approaches to the truth finding problem.",4 "stimont: a core ontology for multimedia stimuli description. affective multimedia documents such as images, sounds or videos elicit emotional responses in exposed human subjects. these stimuli are stored in affective multimedia databases and have been successfully used for a wide variety of research in psychology and neuroscience in areas related to attention and emotion processing. although important, affective multimedia databases have numerous deficiencies which impair their applicability. these problems, which are brought forward in the paper, result in low recall and precision of multimedia stimuli retrieval, which makes creating emotion elicitation procedures difficult and labor-intensive. to address these issues a new core ontology stimont is introduced. stimont is written in owl-dl formalism and extends the w3c emotionml format with an expressive formal representation of affective concepts, high-level semantics, stimuli document metadata and the elicited physiology. the advantages of the ontology in the description of affective multimedia stimuli are demonstrated in a document retrieval experiment and compared against contemporary keyword-based querying methods. also, a software tool intelligent stimulus generator for retrieval of affective multimedia and construction of stimuli sequences is presented.",4 "fever: a large-scale dataset for fact extraction and verification. unlike other tasks and despite recent interest, research in textual claim verification has been hindered by the lack of large-scale manually annotated datasets.
in this paper we introduce a new publicly available dataset for verification against textual sources, fever: fact extraction and verification. it consists of 185,441 claims generated by altering sentences extracted from wikipedia and subsequently verified without knowledge of the sentence they were derived from. the claims are classified as supported, refuted or notenoughinfo by annotators achieving 0.6841 in fleiss $\kappa$. for the first two classes, the annotators also recorded the sentence(s) forming the necessary evidence for their judgment. to characterize the challenge of the dataset presented, we develop a pipeline approach using both baseline and state-of-the-art components and compare it to suitably designed oracles. the best accuracy we achieve in labeling a claim accompanied by the correct evidence is 31.87%, while if we ignore the evidence we achieve 50.91%. thus we believe that fever is a challenging testbed that will help stimulate progress on claim verification against textual sources.",4 "paraconsistency and word puzzles. word puzzles and the problem of their representations in logic languages have received considerable attention in the last decade (ponnuru et al. 2004; shapiro 2011; baral and dzifcak 2012; schwitter 2013). of special interest is the problem of generating such representations directly from natural language (nl) or controlled natural language (cnl). an interesting variation of this problem, and to the best of our knowledge a scarcely explored variation in this context, is when the input information is inconsistent. in such situations, existing encodings of word puzzles produce inconsistent representations and break down. in this paper, we bring a well-known type of paraconsistent logics, called annotated predicate calculus (apc) (kifer and lozinskii 1992), to bear on the problem. we introduce a new kind of non-monotonic semantics for apc, called consistency preferred stable models, and argue that it makes apc a suitable platform for dealing with inconsistency in word puzzles and, more generally, in nl sentences. we also devise a number of general principles to help the user choose among the different representations of nl sentences, which might seem equivalent but, in fact, behave differently when inconsistent information is taken into account. these principles can be incorporated into existing cnl translators, such as attempto controlled english (ace) (fuchs et al. 2008) and peng light (white and schwitter 2009).
finally, we show that apc with the consistency preferred stable model semantics can be equivalently embedded in asp with preferences over stable models, and we use this embedding to implement a version of apc in clingo (gebser et al. 2011) with the asprin add-on (brewka et al. 2015).",4 "advances in hyperspectral image classification: earth monitoring with statistical learning methods. hyperspectral images show similar statistical properties to natural grayscale or color photographic images. however, the classification of hyperspectral images is more challenging because of the high dimensionality of the pixels and the small number of labeled examples typically available for learning. these peculiarities lead to particular signal processing problems, mainly characterized by indetermination and complex manifolds. the framework of statistical learning has gained popularity in the last decade. new methods have been presented to account for the spatial homogeneity of images, to include user's interaction via active learning, to take advantage of the manifold structure with semisupervised learning, to extract and encode invariances, or to adapt classifiers and image representations to unseen yet similar scenes. this tutorial reviews the main advances in hyperspectral remote sensing image classification with illustrative examples.",4 "distance-based confidence score for neural network classifiers. the reliable measurement of confidence in classifiers' predictions is very important for many applications and is, therefore, an important part of classifier design. yet, although deep learning has received tremendous attention in recent years, not much progress has been made in quantifying the prediction confidence of neural network classifiers. bayesian models offer a mathematically grounded framework to reason about model uncertainty, but usually come with prohibitive computational costs. in this paper we propose a simple, scalable method to achieve a reliable confidence score, based on the data embedding derived from the penultimate layer of the network. we investigate two ways to achieve desirable embeddings, using either a distance-based loss or adversarial training. we test the benefits of our method when used for classification error prediction, weighting an ensemble of classifiers, and novelty detection.
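One simple instance of such a distance-based score, given penultimate-layer embeddings: a softmax over negative Euclidean distances to per-class centroids. This is an illustrative sketch; the paper's exact score and embedding losses may differ:

```python
import numpy as np

def fit_centroids(embeddings, labels):
    """Class centroids in the penultimate-layer embedding space."""
    classes = np.unique(labels)
    return classes, np.stack([embeddings[labels == c].mean(axis=0)
                              for c in classes])

def distance_confidence(embedding, centroids):
    """Confidence over classes as a softmax of negative distances:
    points far from every centroid get diffuse, low-confidence scores."""
    d = np.linalg.norm(centroids - embedding, axis=1)
    s = np.exp(-d)
    return s / s.sum()
```

A point near a centroid receives a sharp distribution; distant, out-of-distribution points receive flatter ones, which is what makes the same score usable for novelty detection.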
in all tasks we show significant improvement over traditional, commonly used confidence scores.",4 "multi-path feedback recurrent neural network for scene parsing. in this paper, we consider the scene parsing problem and propose a novel multi-path feedback recurrent neural network (mpf-rnn) for parsing scene images. mpf-rnn can enhance the capability of rnns in modeling long-range context information at multiple levels and better distinguish pixels that are easy to confuse. different from feedforward cnns and rnns with only a single feedback, mpf-rnn propagates the contextual features learned at the top layer through \textit{multiple} weighted recurrent connections to learn bottom features. for better training of mpf-rnn, we propose a new strategy which considers accumulative loss at multiple recurrent steps to improve performance of mpf-rnn on parsing small objects. with these two novel components, mpf-rnn has achieved significant improvement over strong baselines (vgg16 and res101) on five challenging scene parsing benchmarks, including traditional siftflow, barcelona, camvid, stanford background as well as the recently released large-scale ade20k.",4 "generating high-quality query suggestion candidates for task-based search. we address the task of generating query suggestions for task-based search. the current state of the art relies heavily on suggestions provided by a major search engine. in this paper, we solve the task without reliance on search engines. specifically, we focus on the first step of a two-stage pipeline approach, which is dedicated to the generation of query suggestion candidates. we present three methods for generating candidate suggestions and apply them on multiple information sources. using a purpose-built test collection, we find that these methods are able to generate high-quality suggestion candidates.",4 "cumulative distribution networks and the derivative-sum-product algorithm. we introduce a new type of graphical model called a ""cumulative distribution network"" (cdn), which expresses a joint cumulative distribution as a product of local functions. each local function can be viewed as providing evidence about possible orderings, or rankings, of variables. interestingly, we find that the conditional independence properties of cdns are quite different from those of other graphical models.
we also describe a message-passing algorithm that efficiently computes conditional cumulative distributions. due to the unique independence properties of the cdn, these messages do not in general have a one-to-one correspondence with messages exchanged in standard algorithms, such as belief propagation. we demonstrate the application of cdns for structured ranking learning using a previously-studied multi-player gaming dataset.",4 "zoom out-and-in network with recursive training for object proposal. in this paper, we propose a zoom-out-and-in network for generating object proposals. we utilize different resolutions of feature maps in the network to detect object instances of various sizes. specifically, we divide the anchor candidates into three clusters based on the scale size and place them on feature maps of distinct strides to detect small, medium and large objects, respectively. deeper feature maps contain region-level semantics which can help shallow counterparts to identify small objects. therefore we design a zoom-in sub-network to increase the resolution of high level features via a deconvolution operation. the high-level features with high resolution are then combined and merged with low-level features to detect objects. furthermore, we devise a recursive training pipeline to consecutively regress region proposals at the training stage in order to match the iterative regression at the testing stage. we demonstrate the effectiveness of the proposed method on the ilsvrc det and ms coco datasets, where our algorithm performs better than the state-of-the-arts in various evaluation metrics. it also increases average precision by around 2% in the detection system.",4 "face transfer with generative adversarial network. face transfer animates the facial performances of the character in a target video by a source actor. traditional methods are typically based on face modeling. we propose an end-to-end face transfer method based on a generative adversarial network. specifically, we leverage cyclegan to generate the face image of the target character with the corresponding head pose and facial expression of the source. in order to improve the quality of the generated videos, we adopt patchgan and explore the effect of different receptive field sizes on the generated images.",4 "generating extractive summaries of scientific paradigms.
researchers and scientists increasingly find themselves in the position of having to quickly understand large amounts of technical material. our goal is to effectively serve this need by using bibliometric text mining and summarization techniques to generate summaries of scientific literature. we show how we can use citations to produce automatically generated, readily consumable, technical extractive summaries. we first propose c-lexrank, a model for summarizing single scientific articles based on citations, which employs community detection and extracts salient information-rich sentences. next, we extend our experiments to summarize a set of papers which cover the same scientific topic. we generate extractive summaries of a set of question answering (qa) and dependency parsing (dp) papers, their abstracts, and their citation sentences, and show that citations have unique information amenable to creating a summary.",4 "regression-based image alignment for general object categories. gradient-descent methods have exhibited fast and reliable performance for image alignment in the facial domain, but have largely been ignored by the broader vision community. they require the image function be smooth and (numerically) differentiable -- properties that hold for pixel-based representations obeying natural image statistics, but not for more general classes of non-linear feature transforms. we show that transforms such as dense sift can be incorporated into a lucas kanade alignment framework by predicting descent directions via regression. this enables robust matching of instances from general object categories whilst maintaining desirable properties of lucas kanade, such as the capacity to handle high-dimensional warp parametrizations and a fast rate of convergence. we present alignment results on a number of objects from imagenet, and an extension of the method to unsupervised joint alignment of objects from a corpus of images.",4 "algorithm selection for combinatorial search problems: a survey. the algorithm selection problem is concerned with selecting the best algorithm to solve a given problem on a case-by-case basis. it has become especially relevant in the last decade, as researchers are increasingly investigating how to identify the most suitable existing algorithm for solving a problem instead of developing new algorithms.
this survey presents an overview of this work, focusing on the contributions made in the area of combinatorial search problems, where algorithm selection techniques have achieved significant performance improvements. we unify and organise the vast literature according to criteria that determine algorithm selection systems in practice. a comprehensive classification of approaches identifies and analyses the different directions from which algorithm selection has been approached. the paper contrasts and compares different methods for solving the problem as well as ways of using these solutions. it closes by identifying directions of current and future research.",4 "see the tree through the lines: the shazoo algorithm -- full version --. predicting the nodes of a given graph is a fascinating theoretical problem with applications in several domains. since graph sparsification via spanning trees retains enough information while making the task much easier, trees are an important special case of this problem. although it is known how to predict the nodes of an unweighted tree in a nearly optimal way, in the weighted case a fully satisfactory algorithm is not available yet. we fill this hole and introduce an efficient node predictor, shazoo, which is nearly optimal on any weighted tree. moreover, we show that shazoo can be viewed as a common nontrivial generalization of previous approaches for both unweighted trees and weighted lines. experiments on real-world datasets confirm that shazoo performs well in that it fully exploits the structure of the input tree, and gets very close to (and sometimes better than) less scalable energy minimization methods.",4 "affine-gradient based local binary pattern descriptor for texture classification. we present a novel affine-gradient based local binary pattern (aglbp) descriptor for texture classification. it is very hard to describe complicated texture using a single type of information, such as the local binary pattern (lbp), which utilizes the sign information of the difference between a pixel and its local neighbors. our descriptor has three characteristics: 1) in order to make full use of the information contained in the texture, the affine-gradient, which is different from the euclidean-gradient and invariant to affine transformation, is incorporated into aglbp. 2) an improved method is proposed for rotation invariance, which depends on a reference direction calculated with respect to the local neighbors.
3) a feature selection method, considering both the statistical frequency and the intraclass variance on the training dataset, is also applied to reduce the dimensionality of the descriptors. experiments on three standard texture datasets, outex12, outex10 and kth-tips2, are conducted to evaluate the performance of aglbp. the results show that the proposed descriptor gets better performance compared with state-of-the-art rotation-invariant texture descriptors in texture classification.",4 "bayesian matrix completion via adaptive relaxed spectral regularization. bayesian matrix completion has been studied based on a low-rank matrix factorization formulation with promising results. however, little work has been done on bayesian matrix completion based on the more direct spectral regularization formulation. we fill this gap by presenting a novel bayesian matrix completion method based on spectral regularization. in order to circumvent the difficulties of dealing with the orthonormality constraints of singular vectors, we derive a new equivalent form with relaxed constraints, which then leads us to design an adaptive version of spectral regularization feasible for bayesian inference. our bayesian method requires no parameter tuning and can infer the number of latent factors automatically. experiments on synthetic and real datasets demonstrate encouraging results on rank recovery and collaborative filtering, with notably good results for sparse matrices.",4 "a hybrid approach to english-hindi name entity transliteration. machine translation (mt) research in indian languages is still in its infancy. not much work has been done on proper transliteration of name entities in this domain. in this paper we address this issue. we have used the english-hindi language pair for our experiments and have used a hybrid approach. at first we have processed english words with a rule based approach which extracts individual phonemes from the words, and then we have applied a statistical approach which converts the english phonemes into equivalent hindi phonemes and in turn the corresponding hindi word. with this approach we have attained 83.40% accuracy.",4 "quick and energy-efficient bayesian computing of binocular disparity using stochastic digital signals. reconstruction of the tridimensional geometry of a visual scene using the binocular disparity information is an important issue in computer vision and mobile robotics, which can be formulated as a bayesian inference problem.
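The classical LBP sign code that the aglbp entry above builds on can be sketched directly. Each pixel's 8-bit code records, per neighbour, whether that neighbour is at least as bright as the centre (a minimal sketch of basic LBP, not the affine-gradient extension):

```python
import numpy as np

def lbp_codes(img):
    """Basic 8-neighbour local binary pattern: for each interior pixel,
    set one bit per neighbour whose value is >= the centre value."""
    h, w = img.shape
    codes = np.zeros((h - 2, w - 2), dtype=np.uint8)
    # Clockwise neighbour offsets starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    centre = img[1:-1, 1:-1]
    for bit, (dy, dx) in enumerate(offsets):
        neigh = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (neigh >= centre).astype(np.uint8) << bit
    return codes
```

Texture descriptors are then histograms of these codes; aglbp augments the sign information with affine-gradient magnitude information.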
however, computation of the full disparity distribution with an advanced bayesian model is usually an intractable problem, and proves computationally challenging even with a simple model. in this paper, we show how probabilistic hardware using distributed memory and an alternate representation of data as stochastic bitstreams can solve this problem with high performance and energy efficiency. we put forward a way to express discrete probability distributions using stochastic data representations and to perform bayesian fusion using those representations, and we show how this approach can be applied to disparity computation. we evaluate the system using a simulated stochastic implementation and discuss possible hardware implementations of such architectures and their potential for sensorimotor processing and robotics.",4 "automated map reading: image based localisation in 2-d maps using binary semantic descriptors. we describe a novel approach to image based localisation in urban environments using semantic matching between images and a 2-d map. it contrasts with the vast majority of existing approaches which use image to image database matching. we use highly compact binary descriptors to represent semantic features at locations, significantly increasing scalability compared with existing methods and with the potential for greater invariance to variable imaging conditions. the approach is also more akin to human map reading, making it more suited to human-system interaction. the binary descriptors indicate the presence of semantic features relating to buildings and road junctions in discrete viewing directions. we use cnn classifiers to detect the features in images and match descriptor estimates with a database of location tagged descriptors derived from the 2-d map. in isolation, the descriptors are not sufficiently discriminative, but when concatenated sequentially along a route, their combination becomes highly distinctive and allows localisation even when using non-perfect classifiers. performance is further improved by taking into account left or right turns over a route. experimental results obtained using google streetview and openstreetmap data show that the approach has considerable potential, achieving localisation accuracy of around 85% using routes corresponding to approximately 200 meters.",4 "rational competitive analysis.
much work in computer science has adopted competitive analysis as a tool for decision making under uncertainty. in this work we extend competitive analysis to the context of multi-agent systems. unlike classical competitive analysis, where the behavior of an agent's environment is taken to be arbitrary, we consider the case where an agent's environment consists of other agents, which usually obey some (minimal) rationality constraints. this leads to the definition of rational competitive analysis. we introduce the concept of rational competitive analysis and initiate the study of competitive analysis for multi-agent systems. we also discuss the application of rational competitive analysis to the context of bidding games, as well as to the classical one-way trading problem.",4 "on reasonable and forced goal orderings and their use in an agenda-driven planning algorithm. the paper addresses the problem of computing goal orderings, which is one of the longstanding issues in ai planning. it makes two new contributions. first, it formally defines and discusses two different goal orderings, called the reasonable and the forced ordering. both orderings are defined for simple strips operators as well as for more complex adl operators supporting negation and conditional effects. the complexity of the orderings is investigated and their practical relevance is discussed. secondly, two different methods to compute reasonable goal orderings are developed. one is based on planning graphs, while the other investigates the set of actions directly. finally, it is shown how the ordering relations, which have been derived for a given set of goals g, can be used to compute a so-called goal agenda that divides g into an ordered set of subgoals. any planner can then, in principle, use the goal agenda to plan for increasing sets of subgoals. this can lead to an exponential complexity reduction, since the solution to a complex planning problem is found by solving easier subproblems. since only a polynomial overhead is caused by the goal agenda computation, the potential exists to dramatically speed up planning algorithms, as we demonstrate in an empirical evaluation, where we use this method in the ipp planner.",4 "infinite variational autoencoder for semi-supervised learning. this paper presents an infinite variational autoencoder (vae) whose capacity adapts to suit the input data. this is achieved using a mixture model where the mixing coefficients are modeled by a dirichlet process, allowing us to integrate the coefficients out when performing inference.
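The Dirichlet-process mixing weights behind this adaptive capacity can be illustrated with the standard truncated stick-breaking construction (the truncation level and concentration value below are arbitrary choices, not taken from the paper):

```python
import numpy as np

def stick_breaking(alpha, k, seed=None):
    """Truncated stick-breaking construction of Dirichlet-process mixing
    weights: break a unit stick k times, each break taking a Beta(1, alpha)
    fraction of what remains. Small alpha concentrates mass on few sticks."""
    rng = np.random.default_rng(seed)
    betas = rng.beta(1.0, alpha, size=k)
    # Mass remaining before each break: 1, (1-b1), (1-b1)(1-b2), ...
    remaining = np.concatenate([[1.0], np.cumprod(1.0 - betas[:-1])])
    weights = betas * remaining
    # Assign all leftover mass to the last stick so the weights sum to one.
    weights[-1] = 1.0 - weights[:-1].sum()
    return weights
```

In a model like the infinite VAE, each weight would gate one mixture component (one autoencoder), so components with negligible weight are effectively pruned by the data.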
critically, this allows us to automatically vary the number of autoencoders in the mixture based on the data. experiments show the flexibility of our method, particularly for semi-supervised learning, where only a small number of training samples are available.",4 "text summarization using abstract meaning representation. with the ever increasing size of text present on the internet, automatic summary generation remains an important problem for natural language understanding. in this work we explore a novel full-fledged pipeline for text summarization with an intermediate step of abstract meaning representation (amr). the pipeline proposed by us first generates an amr graph of an input story, from which it extracts a summary graph and finally, generates summary sentences from this summary graph. our proposed method achieves state-of-the-art results compared to the other text summarization routines based on amr. we also point out some significant problems in the existing evaluation methods, which make them unsuitable for evaluating summary quality.",4 "monitoring chinese population migration on a consecutive weekly basis from intra-city scale to inter-province scale by didi's bigdata. population migration is valuable information which leads to proper decisions in urban-planning strategy, massive investment, and many other fields. for instance, inter-city migration is posterior evidence to see whether a government's constraint on population works, while inter-community immigration might be prior evidence of a real estate price hike. without timely data, it is impossible to compare which city is more favorable to people, supposing these cities release different new regulations; we could also compare the customers of different real estate development groups, where they come from, and where they will probably go. unfortunately such data is not available. in this paper, leveraging the data generated by the positioning team of didi, we propose a novel approach for timely monitoring population migration from community scale to provincial scale. a migration can be detected as soon as one week. it could be even faster, but we set a week for statistical purposes. a monitoring system has been developed and applied nation wide in china, and some observations derived from this system are presented in this paper. the new method of migration perception originates from the insight that nowadays people mostly move with their personal access point (ap), also known as a wifi hotspot.
assuming the ratio of moving aps to the migrating population is constant, the analysis of comparative population migration becomes feasible. exact quantitative research could also be done with sample research and model regression. the procedures for processing the data include many steps: eliminating the impact of pseudo-migration aps, for instance pocket wifi and second-hand traded routers; distinguishing moving populations from moving companies; identifying shifting ap finger print clusters, etc..",19 "semi-dense 3d semantic mapping from monocular slam. the bundling of geometry and appearance in computer vision has proven to be a promising solution for robots across a wide variety of applications. stereo cameras and rgb-d sensors are widely used to realise fast 3d reconstruction and trajectory tracking in a dense way. however, they lack the flexibility of seamless switching between different scaled environments, i.e., indoor and outdoor scenes. in addition, semantic information is still hard to acquire in 3d mapping. we address this challenge by combining a state-of-art deep learning method with semi-dense simultaneous localisation and mapping (slam) based on a video stream from a monocular camera. in our approach, 2d semantic information is transferred to 3d mapping via correspondence between connective keyframes with spatial consistency. there is no need to obtain a semantic segmentation for each frame in the sequence, so that a reasonable computation time can be achieved. we evaluate our method on indoor/outdoor datasets and show an improvement in 2d semantic labelling over baseline single frame predictions.",4 "improved eeg event classification using differential energy. feature extraction for automatic classification of eeg signals typically relies on time frequency representations of the signal. techniques such as cepstral-based filter banks or wavelets are popular analysis techniques in many signal processing applications including eeg classification. in this paper, we present a comparison of a variety of approaches to estimating and postprocessing features. to aid in the discrimination of periodic signals from aperiodic signals, we add a differential energy term. we evaluate our approaches on the tuh eeg corpus, the largest publicly available eeg corpus and an exceedingly challenging task due to the clinical nature of the data.
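A differential-energy term of the kind described here can be sketched as the range of log frame energy over a sliding window of frames. The window length and this exact formulation are illustrative assumptions, not necessarily the paper's definition:

```python
import numpy as np

def differential_energy(frames, win=9):
    """Per-frame differential energy: max minus min of the log frame
    energy within a window of neighbouring frames. Periodic signals keep
    this range small; bursty, aperiodic events make it large."""
    # Per-frame log energy, guarded against log(0).
    log_e = np.log(np.sum(frames ** 2, axis=1) + 1e-10)
    pad = win // 2
    padded = np.pad(log_e, pad, mode="edge")
    return np.array([padded[i:i + win].max() - padded[i:i + win].min()
                     for i in range(len(log_e))])
```

Such a term is appended to the cepstral filter-bank features, alongside first and second derivatives, before classification.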
we demonstrate that a variant of a standard filter bank-based approach, coupled with first and second derivatives, provides a substantial reduction in the overall error rate. the combination of differential energy and derivatives produces a 24% absolute reduction in the error rate and improves our ability to discriminate signal events from background noise. this relatively simple approach proves to be comparable to other popular feature extraction approaches such as wavelets, while being much more computationally efficient.",6 "convolution aware initialization. initialization of parameters in deep neural networks has been shown to have a big impact on the performance of the networks (mishkin & matas, 2015). the initialization scheme devised by he et al. allowed convolution activations to carry a constrained mean, which allowed deep networks to be trained effectively (he et al., 2015a). orthogonal initializations, and more generally orthogonal matrices in standard recurrent networks, have been proved to eradicate the vanishing and exploding gradient problem (pascanu et al., 2012). the majority of current initialization schemes do not take fully into account the intrinsic structure of the convolution operator. using the duality of the fourier transform and the convolution operator, convolution aware initialization builds orthogonal filters in the fourier space, and using the inverse fourier transform represents them in the standard space. with convolution aware initialization we noticed not only higher accuracy and lower loss, but also faster convergence. we achieve a new state of the art on the cifar10 dataset, and achieve close to state of the art on various other tasks.",4 "joint object category and 3d pose estimation from 2d images. 2d object detection is the task of finding (i) what objects are present in an image and (ii) where they are located, while 3d pose estimation is the task of finding the pose of these objects in 3d space. state-of-the-art methods for solving these tasks follow a two-stage approach where a 3d pose estimation system is applied to bounding boxes (with associated category labels) returned by a 2d detection method. this paper addresses the task of joint object category and 3d pose estimation given a 2d bounding box.
we design a residual network based architecture to solve these two seemingly orthogonal tasks with new category-dependent pose outputs and loss functions, and show state-of-the-art performance on the challenging pascal3d+ dataset.",4 "fixing an error in caponnetto and de vito (2007). the seminal paper of caponnetto and de vito (2007) provides minimax-optimal rates for kernel ridge regression in a very general setting. its proof, however, contains an error in its bound on the effective dimensionality. in this note, we explain the mistake, provide a correct bound, and show that the main theorem remains true.",19 "one size fits many: column bundle for multi-x learning. much recent machine learning research has been directed towards leveraging shared statistics among labels, instances and data views, commonly referred to as multi-label, multi-instance and multi-view learning. the underlying premises are that there exist correlations among input parts and among output targets, and that predictive performance would increase when the correlations are incorporated. in this paper, we propose column bundle (clb), a novel deep neural network for capturing the shared statistics in data. clb is a generic architecture which can be applied to various types of shared statistics by changing the input and output handling. clb is capable of scaling to thousands of input parts and output labels by avoiding explicit modeling of pairwise relations. we evaluate clb on different types of data: (a) multi-label, (b) multi-view, (c) multi-view/multi-label and (d) multi-instance. clb demonstrates comparable and competitive performance on all datasets against state-of-the-art methods designed specifically for each type.",19 "ranking pages and the topology of popularity within web sites. we compare two link analysis ranking methods of web pages in a site. the first, called site rank, is an adaptation of pagerank to the granularity of a web site, and the second, called popularity rank, is based on the frequencies of user clicks on the outlinks of a page, as captured in the navigation sessions of users of the web site. we ran experiments on artificially created web sites of different sizes and on two real data sets, employing the relative entropy to compare the distributions of the two ranking methods. for the real data sets we also employ a nonparametric measure, called spearman's footrule, which we use to compare the top-ten web pages ranked by the two methods.
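Site rank is PageRank computed on a single site's own link graph; the standard power-iteration computation can be sketched as follows (the damping factor 0.85 is the conventional default, not a value taken from this entry):

```python
import numpy as np

def pagerank(adj, damping=0.85, tol=1e-10):
    """PageRank by power iteration on a dense adjacency matrix.
    adj[i, j] = 1 if page i links to page j."""
    n = adj.shape[0]
    # Row-normalise the link matrix; dangling pages jump uniformly.
    out = adj.sum(axis=1, keepdims=True)
    denom = np.where(out == 0.0, 1.0, out)
    P = np.where(out > 0.0, adj / denom, 1.0 / n)
    r = np.full(n, 1.0 / n)
    while True:
        r_new = (1.0 - damping) / n + damping * (r @ P)
        if np.abs(r_new - r).sum() < tol:
            return r_new
        r = r_new
```

Popularity rank replaces the uniform out-link transitions in `P` with empirical click frequencies from navigation sessions; the entry's finding is that the two resulting distributions are surprisingly close.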
our main result is that the distributions of the popularity rank and the site rank are surprisingly close to each other, implying that the topology of a web site is very instrumental in guiding users through the site. thus, in practice, the site rank provides a reasonable first order approximation of the aggregate behaviour of users within a web site as given by the popularity rank.",4 "a kernel method for detecting higher order interactions in multi-view data: an application to imaging, genetics, and epigenetics. in this study, we tested the interaction effect of multimodal datasets using a novel method called the kernel method for detecting higher order interactions among biologically relevant multi-view data. using a semiparametric method on a reproducing kernel hilbert space (rkhs), we used a standard mixed-effects linear model and derived a score-based variance component statistic that tests for higher order interactions in multi-view data. the proposed method offers an intangible framework for the identification of higher order interaction effects (e.g., three way interactions) between genetics, brain imaging, and epigenetic data. extensive numerical simulation studies were first conducted to evaluate the performance of this method. finally, the method was evaluated using data from the mind clinical imaging consortium (mcic) including single nucleotide polymorphism (snp) data, functional magnetic resonance imaging (fmri) scans, and deoxyribonucleic acid (dna) methylation data, respectively, from schizophrenia patients and healthy controls. we treated each gene-derived snp, region of interest (roi) and gene-derived dna methylation as a single testing unit, combined into triplets for evaluation. in addition, cardiovascular disease risk factors such as age, gender, and body mass index were assessed as covariates on hippocampal volume and compared across triplets. our method identified $13$ triplets ($p$-values $\leq 0.001$) that included $6$ gene-derived snps, $10$ rois, and $6$ gene-derived dna methylations correlated with changes in hippocampal volume, suggesting that these triplets may be important in explaining schizophrenia-related neurodegeneration.
strong evidence ($p$-values $\leq 0.000001$), triplet ({\bf magi2, crblcrus1.l, fbxo28}) potential distinguish schizophrenia patients healthy control variations.",19 "urban ozone concentration forecasting artificial neural network corsica. atmospheric pollutants concentration forecasting important issue air quality monitoring. qualitair corse, organization responsible monitoring air quality corsica (france) region, needs develop short-term prediction model lead mission information towards public. various deterministic models exist meso-scale local forecasting, need powerful large variable sets, good knowledge atmospheric processes, inaccurate local climatic geographical particularities, observed corsica, mountainous island located mediterranean sea. result, focus study statistical models, particularly artificial neural networks (ann) shown good results prediction ozone concentration horizon h+1 data measured locally. purpose study build predictor realize predictions ozone pm10 horizon d+1 corsica order able anticipate pollution peak formation take appropriate prevention measures. specific meteorological conditions known lead particular pollution event corsica (e.g. saharan dust event). therefore, several ann models used, meteorological conditions clustering operational forecasting.",4 "unsupervised dynamic image segmentation using fuzzy hopfield neural network based genetic algorithm. paper proposes genetic algorithm based segmentation method automatically segment gray-scale images. proposed method mainly consists spatial unsupervised grayscale image segmentation divides image regions. aim algorithm produce precise segmentation images using intensity information along neighborhood relationships. paper, fuzzy hopfield neural network (fhnn) clustering helps generating population genetic algorithm automatically segments image. technique powerful method image segmentation works single multiple-feature data spatial information. 
validity index utilized introducing robust technique finding optimum number components image. experimental results shown algorithm generates good quality segmented image.",4 "intrusion detection using continuous time bayesian networks. intrusion detection systems (idss) fall two high-level categories: network-based systems (nids) monitor network behaviors, host-based systems (hids) monitor system calls. work, present general technique systems. use anomaly detection, identifies patterns conforming historic norm. types systems, rates change vary dramatically time (due burstiness) components (due service difference). efficiently model systems, use continuous time bayesian networks (ctbns) avoid specifying fixed update interval common discrete-time models. build generative models normal training data, abnormal behaviors flagged based likelihood norm. nids, construct hierarchical ctbn model network packet traces use rao-blackwellized particle filtering learn parameters. illustrate power method experiments detecting real worms identifying hosts two publicly available network traces, mawi dataset lbnl dataset. hids, develop novel learning method deal finite resolution system log file time stamps, without losing benefits continuous time model. demonstrate method detecting intrusions darpa 1998 bsm dataset.",4 "distributed evolutionary k-way node separators. computing high quality node separators large graphs necessary variety applications, ranging divide-and-conquer algorithms vlsi design. work, present novel distributed evolutionary algorithm tackling k-way node separator problem. key component contribution includes new k-way local search algorithms based maximum flows. combine local search multilevel approach compute initial population evolutionary algorithm, show modify coarsening stage multilevel algorithm create effective combine mutation operations. 
lastly, combine techniques scalable communication protocol, producing system able compute high quality solutions short amount time. experiments competing algorithms show advanced evolutionary algorithm computes best result 94% chosen benchmark instances.",4 "instance similarity deep hashing multi-label image retrieval. hash coding widely used approximate nearest neighbor search large-scale image retrieval. recently, many deep hashing methods proposed shown largely improved performance traditional feature-learning-based methods. methods examine pairwise similarity semantic-level labels, pairwise similarity generally defined hard-assignment way. is, pairwise similarity '1' share less one class label '0' share any. however, similarity definition cannot reflect similarity ranking pairwise images hold multiple labels. paper, new deep hashing method proposed multi-label image retrieval re-defining pairwise similarity instance similarity, instance similarity quantified percentage based normalized semantic labels. based instance similarity, weighted cross-entropy loss minimum mean square error loss tailored loss-function construction, efficiently used simultaneous feature learning hash coding. experiments three popular datasets demonstrate that, proposed method outperforms competing methods achieves state-of-the-art performance multi-label image retrieval.",4 "comparison several reweighted l1-algorithms solving cardinality minimization problems. reweighted l1-algorithms attracted lot attention field applied mathematics. unified framework algorithms recently proposed zhao li. paper construct new examples reweighted l1-methods. functions certain concave approximations l0-norm function. focus numerical comparison new existing reweighted l1-algorithms. show change parameters reweighted algorithms may affect performance algorithms finding solution cardinality minimization problem. 
experiments, problem data generated according different statistical distributions, test algorithms different sparsity level solution problem. numerical results demonstrate reweighted l1-method one efficient methods locating solution cardinality minimization problem.",12 "learning compare: relation network few-shot learning. present conceptually simple, flexible, general framework few-shot learning, classifier must learn recognise new classes given examples each. method, called relation network (rn), trained end-to-end scratch. meta-learning, learns learn deep distance metric compare small number images within episodes, designed simulate few-shot setting. trained, rn able classify images new classes computing relation scores query images examples new class without updating network. besides providing improved performance few-shot learning, framework easily extended zero-shot learning. extensive experiments four datasets demonstrate simple approach provides unified effective approach two tasks.",4 "decision-theoretic planning concurrent temporally extended actions. investigate model planning uncertainty temporally extended actions, multiple actions taken concurrently decision epoch. model based options framework, combines factored state space models, where set options partitioned classes affect disjoint state variables. show set decision epochs concurrent options defines semi-markov decision process, underlying temporally extended actions parallelized are restricted markov options. property allows us use smdp algorithms computing value function concurrent options. concurrent options model allows overlapping execution of options order achieve higher performance order perform a complex task. describe simple experiment using navigation task illustrates concurrent options results faster plan when compared case one option taken time.",4 "correlation clustering noisy partial information. paper, propose study semi-random model correlation clustering problem arbitrary graphs g. 
give two approximation algorithms correlation clustering instances model. first algorithm finds solution value $(1+ \delta) optcost + o_{\delta}(n\log^3 n)$ high probability, $optcost$ value optimal solution (for every $\delta > 0$). second algorithm finds ground truth clustering arbitrarily small classification error $\eta$ (under additional assumptions instance).",4 "particle filtering audio localization manifold. present novel particle filtering algorithm tracking moving sound source using microphone array. n microphones array, track $n \choose 2$ delays single particle filter time. since known tracking high dimensions rife difficulties, instead integrate particle filter model low dimensional manifold delays lie on. manifold model based work modeling low dimensional manifolds via random projection trees [1]. addition, also introduce new weighting scheme particle filtering algorithm based recent advancements online learning. show novel tdoa tracking algorithm integrates manifold model greatly outperform standard particle filters audio tracking task.",4 "effective feature selection method based pair-wise feature proximity high dimensional low sample size data. feature selection studied widely literature. however, efficacy selection criteria low sample size applications neglected cases. existing feature selection criteria based sample similarity. however, distance measures become insignificant high dimensional low sample size (hdlss) data. moreover, variance feature samples pointless unless represents data distribution efficiently. instead looking samples groups, evaluate efficiency based pairwise fashion. investigation, noticed considering pair samples time selecting features bring closer put far away better choice feature selection. 
experimental results benchmark data sets demonstrate effectiveness proposed method low sample size, outperforms many state-of-the-art feature selection methods.",4 "progressive versus random projections compressive capture images, lightfields higher dimensional visual signals. computational photography involves sophisticated capture methods. new trend capture projection higher dimensional visual signals videos, multi-spectral data lightfields lower dimensional sensors. carefully designed capture methods exploit sparsity underlying signal transformed domain reduce number measurements use appropriate reconstruction method. traditional progressive methods may capture successively detail using sequence simple projection basis, dct wavelets employ straightforward backprojection reconstruction. randomized projection methods use specific sequence use l0 minimization reconstruction. paper, analyze statistical properties natural images, videos, multi-spectral data light-fields compare effectiveness progressive random projections. define effectiveness plotting reconstruction snr compression factor. key idea procedure measure best-case effectiveness fast, independent specific hardware independent reconstruction procedure. believe first empirical study compare different lossy capture strategies without complication hardware reconstruction ambiguity. scope limited linear non-adaptive sensing. results show random projections produce significant advantages projections higher dimensional signals, suggest research nascent adaptive non-linear projection methods.",4 "multi-level anomaly detection time-varying graph data. work presents novel modeling analysis framework graph sequences addresses challenge detecting contextualizing anomalies labelled, streaming graph data. introduce generalization bter model seshadhri et al. adding flexibility community structure, use model perform multi-scale graph anomaly detection. 
specifically, probability models describing coarse subgraphs built aggregating probabilities finer levels, closely related hierarchical models simultaneously detect deviations expectation. technique provides insight graph's structure internal context may shed light detected event. additionally, multi-scale analysis facilitates intuitive visualizations allowing users narrow focus anomalous graph particular subgraphs nodes causing anomaly. evaluation, two hierarchical anomaly detectors tested baseline gaussian method series sampled graphs. demonstrate graph statistics-based approach outperforms distribution-based detector baseline labeled setting community structure, accurately detects anomalies synthetic real-world datasets node, subgraph, graph levels. illustrate accessibility information made possible via technique, anomaly detector associated interactive visualization tool tested ncaa football data, teams conferences moved within league identified perfect recall, precision greater 0.786.",4 "multiple object tracking context awareness. multiple people tracking key problem many applications surveillance, animation car navigation, key input tasks activity recognition. crowded environments occlusions false detections common, although substantial advances recent years, tracking still challenging task. tracking typically divided two steps: detection, i.e., locating pedestrians image, data association, i.e., linking detections across frames form complete trajectories. data association task, approaches typically aim developing new, complex formulations, turn put focus optimization techniques required solve them. however, still utilize basic information distance detections. thesis, focus data association task argue contextual information fully exploited yet tracking community, mainly social context spatial context coming different views.",4 kernel diff-hash. 
paper presents kernel formulation recently introduced diff-hash algorithm construction similarity-sensitive hash functions. kernel diff-hash algorithm shows superior performance problem image feature descriptor matching.,4 "extractive summarization: limits, compression, generalized model heuristics. due promise alleviate information overload, text summarization attracted attention many researchers. however, remained serious challenge. here, first prove empirical limits recall (and f1-scores) extractive summarizers duc datasets rouge evaluation single-document multi-document summarization tasks. next define concept compressibility document present new model summarization, generalizes existing models literature integrates several dimensions summarization, viz., abstractive versus extractive, single versus multi-document, syntactic versus semantic. finally, examine new existing single-document summarization algorithms single framework compare state art summarizers duc data.",4 "language models image captioning: quirks works. two recent approaches achieved state-of-the-art results image captioning. first uses pipelined process set candidate words generated convolutional neural network (cnn) trained images, maximum entropy (me) language model used arrange words coherent sentence. second uses penultimate activation layer cnn input recurrent neural network (rnn) generates caption sequence. paper, compare merits different language modeling approaches first time using state-of-the-art cnn input. examine issues different approaches, including linguistic irregularities, caption repetition, data set overlap. combining key aspects rnn methods, achieve new record performance previously published results benchmark coco dataset. however, gains see bleu translate human judgments.",4 "ambiguity language networks. human language defines complex outcomes evolution. 
emergence elaborated form communication allowed humans create extremely structured societies manage symbols different levels including, among others, semantics. linguistic levels deal astronomic combinatorial potential stems recursive nature languages. recursiveness indeed key defining trait. however, words equally combined frequent. breaking symmetry less often used less meaning-bearing units, universal scaling laws arise. laws, common human languages, appear different stages word inventories networks interacting words. among seemingly universal traits exhibited language networks, ambiguity appears specially relevant component. ambiguity avoided computational approaches language processing, yet seems crucial element language architecture. review evidence language network architecture theoretical reasonings based least effort argument. ambiguity shown play essential role providing source language efficiency, likely inevitable byproduct network growth.",15 "linguistic descriptions data help teaching-learning process higher education, case study: artificial intelligence. artificial intelligence central topic computer science curriculum. year 2011 project-based learning methodology based computer games designed implemented artificial intelligence course university bio-bio. project aims develop software-controlled agents (bots) programmed using heuristic algorithms seen course. methodology allows us obtain good learning results, however several challenges found implementation. paper show linguistic descriptions data help provide students teachers technical personalized feedback learned algorithms. algorithm behavior profile new turing test computer games bots based linguistic modelling complex phenomena also proposed order deal challenges. order show explore possibilities new technology, web platform designed implemented one authors incorporation process assessment allows us improve teaching learning process.",4 "dynamic controllability conditional stns uncertainty. 
recent attempts automate business processes medical-treatment processes uncovered need formal framework accommodate temporal constraints, also observations actions uncontrollable durations. meet need, paper defines conditional simple temporal network uncertainty (cstnu) combines simple temporal constraints simple temporal network (stn) conditional nodes conditional simple temporal problem (cstp) contingent links simple temporal network uncertainty (stnu). notion dynamic controllability cstnu defined generalizes dynamic consistency ctp dynamic controllability stnu. paper also presents sound constraint-propagation rules dynamic controllability expected form backbone dynamic-controllability-checking algorithm cstnus.",4 "adversarial examples easily detected: bypassing ten detection methods. neural networks known vulnerable adversarial examples: inputs close natural inputs classified incorrectly. order better understand space adversarial examples, survey ten recent proposals designed detection compare efficacy. show defeated constructing new loss functions. conclude adversarial examples significantly harder detect previously appreciated, properties believed intrinsic adversarial examples fact not. finally, propose several simple guidelines evaluating future proposed defenses.",4 "web-scale training face identification. scaling machine learning methods large datasets attracted considerable attention recent years, thanks easy access ubiquitous sensing data web. study face recognition show three distinct properties surprising effects transferability deep convolutional networks (cnn): (1) bottleneck network serves important transfer learning regularizer, (2) contrast common wisdom, performance saturation may exist cnn's (as number training samples grows); propose solution alleviating replacing naive random subsampling training set bootstrapping process. moreover, (3) find link representation norm ability discriminate target domain, sheds light networks represent faces. 
based discoveries, able improve face recognition accuracy widely used lfw benchmark, verification (1:1) identification (1:n) protocols, directly compare, first time, state art commercial off-the-shelf system show sizable leap performance.",4 "general framework development cortex-like visual object recognition system: waves spikes, predictive coding universal dictionary features. study focused development cortex-like visual object recognition system. propose general framework, consists three hierarchical levels (modules). modules functionally correspond v1, v4 areas. bottom-up top-down connections hierarchical levels v4 employed. higher degree matching input preferred stimulus, shorter response time neuron. therefore information single stimulus distributed time transmitted waves spikes. reciprocal connections waves spikes implement predictive coding: initial hypothesis generated basis information delivered first wave spikes tested information carried consecutive waves. development considered extraction accumulation features v4 objects it. stored feature disposed, rarely activated. cause update feature repository. consequently, objects also updated. illustrates growing process dynamical change topological structures v4, connections areas.",4 "sldr-dl: framework sld-resolution deep learning. paper introduces sld-resolution technique based deep learning. technique enables neural networks learn old successful resolution processes use learnt experiences guide new resolution processes. implementation technique named sldr-dl. includes prolog library deep feedforward neural networks essential functions resolution. sldr-dl framework, users define logical rules form definite clauses teach neural networks use rules reasoning processes.",4 "single multiple illuminant estimation using convolutional neural networks. paper present method estimation color illuminant raw images. method includes convolutional neural network specially designed produce multiple local estimates. 
multiple illuminant detector determines whether local outputs network must aggregated single estimate. evaluated method standard datasets single multiple illuminants, obtaining lower estimation errors respect obtained general purpose methods state art.",4 tail inequality quadratic forms subgaussian random vectors. prove exponential probability tail inequality positive semidefinite quadratic forms subgaussian random vector. bound analogous one holds vector independent gaussian entries.,12 "prodige: prioritization disease genes multitask machine learning positive unlabeled examples. elucidating genetic basis human diseases central goal genetics molecular biology. traditional linkage analysis modern high-throughput techniques often provide long lists tens hundreds disease gene candidates, identification disease genes among candidates remains time-consuming expensive. efficient computational methods therefore needed prioritize genes within list candidates, exploiting wealth information available genes various databases. propose prodige, novel algorithm prioritization disease genes. prodige implements novel machine learning strategy based learning positive unlabeled examples, allows integrate various sources information genes, share information known disease genes across diseases, perform genome-wide searches new disease genes. experiments real data show prodige outperforms state-of-the-art methods prioritization genes human diseases.",16 "exact tensor completion using t-svd. paper focus problem completion multidimensional arrays (also referred tensors) limited sampling. approach based recently proposed tensor-singular value decomposition (t-svd) [1]. using factorization one derive notion tensor rank, referred tensor tubal rank, optimality properties similar matrix rank derived svd. shown [2] multidimensional data, panning video sequences exhibit low tensor tubal rank look problem completing data random sampling data cube. 
show solving convex optimization problem, minimizes tensor nuclear norm obtained convex relaxation tensor tubal rank, one guarantee recovery overwhelming probability long samples proportion degrees freedom t-svd observed. sense results order-wise optimal. conditions result holds similar incoherency conditions matrix completion, albeit define incoherency algebraic set-up t-svd. show performance algorithm real data sets compare existing approaches based tensor flattening tucker decomposition.",4 "summarizing decisions spoken meetings. paper addresses problem summarizing decisions spoken meetings: goal produce concise {\it decision abstract} meeting decision. explore compare token-level dialogue act-level automatic summarization methods using unsupervised supervised learning frameworks. supervised summarization setting, given true clusterings decision-related utterances, find token-level summaries employ discourse context approach upper bound decision abstracts derived directly dialogue acts. unsupervised summarization setting, we find summaries based unsupervised partitioning decision-related utterances perform comparably based partitions generated using supervised techniques (0.22 rouge-f1 using lda-based topic models vs. 0.23 using svms).",4 "segan: adversarial network multi-scale $l_1$ loss medical image segmentation. inspired classic generative adversarial networks (gan), propose novel end-to-end adversarial neural network, called segan, task medical image segmentation. since image segmentation requires dense, pixel-level labeling, single scalar real/fake output classic gan's discriminator may ineffective producing stable sufficient gradient feedback networks. instead, use fully convolutional neural network segmentor generate segmentation label maps, propose novel adversarial critic network multi-scale $l_1$ loss function force critic segmentor learn global local features capture long- short-range spatial relationships pixels. 
segan framework, segmentor critic networks trained alternating fashion min-max game: critic takes input pair images, (original_image $*$ predicted_label_map, original_image $*$ ground_truth_label_map), trained maximizing multi-scale loss function; segmentor trained gradients passed along critic, aim minimize multi-scale loss function. show segan framework effective stable segmentation task, leads better performance state-of-the-art u-net segmentation method. tested segan method using datasets miccai brats brain tumor segmentation challenge. extensive experimental results demonstrate effectiveness proposed segan multi-scale loss: brats 2013 segan gives performance comparable state-of-the-art whole tumor tumor core segmentation achieves better precision sensitivity gd-enhance tumor core segmentation; brats 2015 segan achieves better performance state-of-the-art dice score precision.",4 "end-to-end deep reinforcement learning lane keeping assist. reinforcement learning considered strong ai paradigm used teach machines interaction environment learning mistakes, yet successfully used automotive applications. recently revival interest topic, however, driven ability deep learning algorithms learn good representations environment. motivated google deepmind's successful demonstrations learning games breakout go, propose different methods autonomous driving using deep reinforcement learning. particular interest difficult pose autonomous driving supervised learning problem strong interaction environment including vehicles, pedestrians roadworks. relatively new area research autonomous driving, formulate two main categories algorithms: 1) discrete actions category, 2) continuous actions category. discrete actions category, deal deep q-network algorithm (dqn) continuous actions category, deal deep deterministic actor critic algorithm (ddac). addition that, also discover performance two categories open source car simulator racing called (torcs) stands open racing car simulator. 
simulation results demonstrate learning autonomous maneuvering scenario complex road curvatures simple interaction vehicles. finally, explain effect restricted conditions, put car learning phase, convergence time finishing learning phase.",19 "double sparse multi-frame image super resolution. large number image super resolution algorithms based sparse coding proposed, algorithms realize multi-frame super resolution. multi-frame super resolution based sparse coding, accurate image registration sparse coding required. previous study multi-frame super resolution based sparse coding firstly apply block matching image registration, followed sparse coding enhance image resolution. paper, two problems solved optimizing single objective function. results numerical experiments support effectiveness proposed approach.",4 "word learning infinite uncertainty. language learners must learn meanings many thousands words, despite words occurring complex environments infinitely many meanings might inferred learner word's true meaning. problem infinite referential uncertainty often attributed willard van orman quine. provide mathematical formalisation ideal cross-situational learner attempting learn infinite referential uncertainty, identify conditions word learning possible. quine's intuitions suggest, learning infinite uncertainty fact possible, provided learners means ranking candidate word meanings terms plausibility; furthermore, analysis shows ranking could fact exceedingly weak, implying constraints allow learners infer plausibility candidate word meanings could weak. approach lifts burden explanation `smart' word learning constraints learners, suggests programme research weak, unreliable, probabilistic constraints inference word meaning real word learners.",15 "defensive forecasting optimal prediction expert advice. method defensive forecasting applied problem prediction expert advice binary outcomes. 
turns defensive forecasting competitive aggregating algorithm also handles case ""second-guessing"" experts, whose advice depends learner's prediction; paper assumes dependence learner's prediction continuous.",4 "consistency auc pairwise optimization. auc (area roc curve) important evaluation criterion, popularly used many learning tasks class-imbalance learning, cost-sensitive learning, learning rank, etc. many learning approaches try optimize auc, owing non-convexity discontinuousness auc, almost approaches work surrogate loss functions. thus, consistency auc crucial; however, almost untouched before. paper, provide sufficient condition asymptotic consistency learning approaches based surrogate loss functions. based result, prove exponential loss logistic loss consistent auc, hinge loss inconsistent. then, derive $q$-norm hinge loss general hinge loss consistent auc. also derive consistent bounds exponential loss logistic loss, obtain consistent bounds many surrogate loss functions non-noise setting. further, disclose equivalence exponential surrogate loss auc exponential surrogate loss accuracy, one straightforward consequence finding adaboost rankboost equivalent.",4 "monocular visual odometry unmanned sea-surface vehicle. tackle problem localizing autonomous sea-surface vehicle river estuarine areas using monocular camera angular velocity input inertial sensor. method challenged two prominent drawbacks associated environment, typically present standard visual simultaneous localization mapping (slam) applications land (or air): a) scene depth varies significantly (from meters several kilometers) and, b) conjunction latter, exists ground plane provide features enough disparity based reliably detect motion. end, use imu orientation feedback order re-cast problem visual localization without mapping component, although map implicitly obtained camera pose estimates. find method produces reliable odometry estimates trajectories several hundred meters long water. 
compare visual odometry estimates gps based ground truth, interpolate trajectory splines common parameter obtain position error meters recovering optimal affine transformation two splines.",4 "sparse communication distributed gradient descent. make distributed stochastic gradient descent faster exchanging sparse updates instead dense updates. gradient updates positively skewed updates near zero, map 99% smallest updates (by absolute value) zero exchange sparse matrices. method combined quantization improve compression. explore different configurations apply neural machine translation mnist image classification tasks. configurations work mnist, whereas different configurations reduce convergence rate complex translation task. experiments show achieve 49% speed mnist 22% nmt without damaging final accuracy bleu.",4 "sequence-to-sequence generation spoken dialogue via deep syntax trees strings. present natural language generator based sequence-to-sequence approach trained produce natural language strings well deep syntax dependency trees input dialogue acts, use directly compare two-step generation separate sentence planning surface realization stages joint, one-step approach. able train setups successfully using little training data. joint setup offers better performance, surpassing state-of-the-art regards n-gram-based scores providing relevant outputs.",4 "fast approximate bayesian computation estimating parameters differential equations. approximate bayesian computation (abc) using sequential monte carlo method provides comprehensive platform parameter estimation, model selection sensitivity analysis differential equations. however, method, like monte carlo methods, incurs significant computational cost requires explicit numerical integration differential equations carry inference. paper propose novel method circumventing requirement explicit integration using derivatives gaussian processes smooth observations parameters estimated. 
evaluate methods using synthetic data generated model biological systems described ordinary delay differential equations. upon comparing performance method existing abc techniques, demonstrate produces comparably reliable parameter estimates significantly reduced execution time.",19 "formal measure machine intelligence. fundamental problem artificial intelligence nobody really knows intelligence is. problem especially acute need consider artificial systems significantly different humans. paper approach problem following way: take number well known informal definitions human intelligence given experts, extract essential features. mathematically formalised produce general measure intelligence arbitrary machines. believe measure formally captures concept machine intelligence broadest reasonable sense.",4 "text2shape: generating shapes natural language learning joint embeddings. present method generating colored 3d shapes natural language. end, first learn joint embeddings freeform text descriptions colored 3d shapes. model combines extends learning association metric learning approaches learn implicit cross-modal connections, produces joint representation captures many-to-many relations language physical properties 3d shapes color shape. evaluate approach, collect large dataset natural language descriptions physical 3d objects shapenet dataset. learned joint embedding demonstrate text-to-shape retrieval outperforms baseline approaches. using embeddings novel conditional wasserstein gan framework, generate colored 3d shapes text. method first connect natural language text realistic 3d objects exhibiting rich variations color, texture, shape detail. see video https://youtu.be/zrapvrdl13q",4 "practical method solving contextual bandit problems using decision trees. many efficient algorithms strong theoretical guarantees proposed contextual multi-armed bandit problem. 
however, applying algorithms practice difficult require domain expertise build appropriate features tune parameters. propose new method contextual bandit problem simple, practical, applied little domain expertise. algorithm relies decision trees model context-reward relationship. decision trees non-parametric, interpretable, work well without hand-crafted features. guide exploration-exploitation trade-off, use bootstrapping approach abstracts thompson sampling non-bayesian settings. also discuss several computational heuristics demonstrate performance method several datasets.",4 "entropy analysis word-length series natural language texts: effects text language genre. estimate $n$-gram entropies natural language texts word-length representation find sensitive text language genre. attribute sensitivity changes probability distribution lengths single words emphasize crucial role uniformity probabilities words length five ten. furthermore, comparison entropies shuffled data reveals impact word length correlations estimated $n$-gram entropies.",4 "high resolution face completion multiple controllable attributes via fully end-to-end progressive generative adversarial networks. present deep learning approach high resolution face completion multiple controllable attributes (e.g., male smiling) arbitrary masks. face completion entails understanding structural meaningfulness appearance consistency locally globally fill ""holes"" whose content appear elsewhere input image. challenging task difficulty level increasing significantly respect high resolution, complexity ""holes"" controllable attributes filled-in fragments. system addresses challenges learning fully end-to-end framework trains generative adversarial networks (gans) progressively low resolution high resolution conditional vectors encoding controllable attributes. design novel network architectures exploit information across multiple scales effectively efficiently. 
introduce new loss functions encouraging sharp completion. show system complete faces large structural appearance variations using single feed-forward pass computation mean inference time 0.007 seconds images 1024 x 1024 resolution. also perform pilot human study shows approach outperforms state-of-the-art face completion methods terms rank analysis. code released upon publication.",4 "adversarial feature learning. ability generative adversarial networks (gans) framework learn generative models mapping simple latent distributions arbitrarily complex data distributions demonstrated empirically, compelling results showing latent space generators captures semantic variation data distribution. intuitively, models trained predict semantic latent representations given data may serve useful feature representations auxiliary problems semantics relevant. however, existing form, gans means learning inverse mapping -- projecting data back latent space. propose bidirectional generative adversarial networks (bigans) means learning inverse mapping, demonstrate resulting learned feature representation useful auxiliary supervised discrimination tasks, competitive contemporary approaches unsupervised self-supervised feature learning.",4 "lstm networks data-aware remaining time prediction business process instances. predicting completion time business process instances would helpful aid managing processes service level agreement constraints. ability know advance trend running process instances would allow business managers react time, order prevent delays undesirable situations. however, making accurate forecasts easy: many factors may influence required time complete process instance. paper, propose approach based deep recurrent neural networks (specifically lstms) able exploit arbitrary information associated single events, order produce as-accurate-as-possible prediction completion time running instances. 
experiments real-world datasets confirm quality proposal.",4 "causal rule sets identifying subgroups enhanced treatment effect. introduce novel generative model interpretable subgroup analysis causal inference applications, causal rule sets (crs). crs model uses small set short rules capture subgroup average treatment effect elevated compared entire population. present bayesian framework learning causal rule set. bayesian framework consists prior favors simpler models bayesian logistic regression characterizes relation outcomes, attributes subgroup membership. find maximum posteriori models using discrete monte carlo steps joint solution space rules sets parameters. provide theoretically grounded heuristics bounding strategies improve search efficiency. experiments show search algorithm efficiently recover true underlying subgroup crs shows consistently competitive performance compared state-of-the-art baseline methods.",4 "neural paraphrase generation stacked residual lstm networks. paper, propose novel neural approach paraphrase generation. conventional paraphrase generation methods either leverage hand-written rules thesauri-based alignments, use statistical machine learning principles. best knowledge, work first explore deep learning models paraphrase generation. primary contribution stacked residual lstm network, add residual connections lstm layers. allows efficient training deep lstms. evaluate model state-of-the-art deep learning models three different datasets: ppdb, wikianswers mscoco. evaluation results demonstrate model outperforms sequence sequence, attention-based bidirectional lstm models bleu, meteor, ter embedding-based sentence similarity metric.",4 "optimal allocation strategies dark pool problem. study problem allocating stocks dark pools. propose analyze optimal approach allocations, continuous-valued allocations allowed. also propose modification case integer-valued allocations possible.
extend previous work problem adversarial scenarios, also improving results iid setup. resulting algorithms efficient, perform well simulations stochastic adversarial inputs.",19 "active user authentication smartphones: challenge data set benchmark results. paper, automated user verification techniques smartphones investigated. unique non-commercial dataset, university maryland active authentication dataset 02 (umdaa-02) multi-modal user authentication research introduced. paper focuses three sensors - front camera, touch sensor location service providing general description modalities. benchmark results face detection, face verification, touch-based user identification location-based next-place prediction presented, indicate robust methods fine-tuned mobile platform needed achieve satisfactory verification accuracy. dataset made available research community promoting additional research.",4 "hybrid decision support system : application healthcare. many systems based knowledge, especially expert systems medical decision support developed. systems based production rules, cannot learn evolve updating them. addition, taking account several criteria induces exorbitant number rules injected system. becomes difficult translate medical knowledge support decision simple rule. moreover, reasoning based generic cases became classic even reduce range possible solutions. remedy that, propose approach based using multi-criteria decision guided case-based reasoning (cbr) approach.",4 "facial expression detection using patch-based eigen-face isomap networks. automated facial expression detection problem pose two primary challenges include variations expression facial occlusions (glasses, beard, mustache face covers). paper introduce novel automated patch creation technique masks particular region interest face, followed eigen-value decomposition patched faces generation isomaps detect underlying clustering patterns among faces. 
proposed masked eigen-face based isomap clustering technique achieves 75% sensitivity 66-73% accuracy classification faces occlusions smiling faces around 1 second per image. also, betweenness centrality, eigen centrality maximum information flow used network-based measures identify significant training faces expression classification tasks. proposed method used combination feature-based expression classification methods large data sets improving expression classification accuracies.",4 "matching-based selection incomplete lists decomposition multi-objective optimization. balance convergence diversity key issue evolutionary multi-objective optimization. recently proposed stable matching-based selection provides new perspective handle balance framework decomposition multi-objective optimization. particular, stable matching subproblems solutions, achieves equilibrium mutual preferences, implicitly strikes balance convergence diversity. nevertheless, original stable matching model high risk matching solution unfavorable subproblem finally leads imbalanced selection result. paper, propose adaptive two-level stable matching-based selection decomposition multi-objective optimization. specifically, borrowing idea stable matching incomplete lists, match solution one favorite subproblems restricting length preference list first-level stable matching. second-level stable matching, remaining subproblems thereafter matched favorite solutions according classic stable matching model. particular, develop adaptive mechanism automatically set length preference list solution according local competitiveness. performance proposed method validated compared several state-of-the-art evolutionary multi-objective optimization algorithms 62 benchmark problem instances. empirical results fully demonstrate competitive performance proposed method problems complicated pareto sets three objectives.",4 "existence finiteness conditions risk-sensitive planning: results conjectures. 
decision-theoretic planning risk-sensitive planning objectives important building autonomous agents decision-support systems real-world applications. however, line research largely ignored artificial intelligence operations research communities since planning risk-sensitive planning objectives complicated planning risk-neutral planning objectives. remedy situation, derive conditions guarantee optimal expected utilities total plan-execution reward exist finite fully observable markov decision process models non-linear utility functions. case markov decision process models positive negative rewards, results hold stationary policies only, conjecture generalized non-stationary policies.",4 "belief revision: critique. examine carefully rationale underlying approaches belief change taken literature, highlight view methodological problems. argue study belief change carefully, must quite explicit ``ontology'' scenario underlying belief change process. something missing previous work, focus postulates. analysis shows must pay particular attention two issues often taken granted: first model agent's epistemic state. (do use set beliefs, richer structure, ordering worlds? use set beliefs, language beliefs expressed?) show even postulates called ``beyond controversy'' unreasonable agent's beliefs include beliefs epistemic state well external world. second status observations. (are observations known true, believed? latter case, firm belief?) issues regarding status observations arise particularly consider iterated belief revision, must confront possibility revising p not-p.",4 "algebras measurements: logical structure quantum mechanics. quantum physics, measurement represented projection closed subspace hilbert space. study algebras operators abstract algebra projections closed subspaces hilbert space. properties operators justified epistemological grounds. commutation measurements central topic interest.
classical logical systems may viewed measurement algebras measurements commute. keywords: quantum measurements, measurement algebras, quantum logic. pacs: 02.10.-v.",18 "study topological descriptors analysis 3d surface texture. methods computational topology becoming popular computer vision shown improve state-of-the-art several tasks. paper, investigate applicability topological descriptors context 3d surface analysis classification different surface textures. present comprehensive study topological descriptors, investigate robustness expressiveness compare state-of-the-art methods including convolutional neural networks (cnns). results show class-specific information reflected well topological descriptors. investigated descriptors directly compete non-topological descriptors capture complementary information. consequence improve state-of-the-art combined non-topological descriptors.",4 "group factor analysis. factor analysis provides linear factors describe relationships individual variables data set. extend classical formulation linear factors describe relationships groups variables, group represents either set related variables data set. model also naturally extends canonical correlation analysis two sets, way flexible previous extensions. solution formulated variational inference latent variable model structural sparsity, consists two hierarchical levels: higher level models relationships groups, whereas lower models observed variables given higher level. show resulting solution solves group factor analysis problem accurately, outperforming alternative factor analysis based solutions well straightforward implementations group factor analysis. method demonstrated two life science data sets, one brain activation systems biology, illustrating applicability analysis different types high-dimensional data sources.",19 "construction non-convex polynomial loss functions training binary classifier quantum annealing. 
quantum annealing heuristic quantum algorithm exploits quantum resources minimize objective function embedded energy levels programmable physical system. take advantage potential quantum advantage, one needs able map problem interest native hardware reasonably low overhead. experimental considerations constrain objective function take form low degree pubo (polynomial unconstrained binary optimization), employ non-convex loss functions polynomial functions margin. show loss functions robust label noise provide clear advantage convex methods. loss functions may also useful classical approaches compile regularized risk expressions evaluated constant time respect number training examples.",4 "alternating optimization method based nonnegative matrix factorizations deep neural networks. backpropagation algorithm calculating gradients widely used computation weights deep neural networks (dnns). method requires derivatives objective functions difficulties finding appropriate parameters learning rate. paper, propose novel approach computing weight matrices fully-connected dnns using two types semi-nonnegative matrix factorizations (semi-nmfs). method, optimization processes performed calculating weight matrices alternately, backpropagation (bp) used. also present method calculate stacked autoencoder using nmf. output results autoencoder used pre-training data dnns. experimental results show method using three types nmfs attains similar error rates conventional dnns bp.",4 "logical n-and gate molecular turing machine. boolean algebra, known logical function corresponds negation conjunction --nand-- universal sense logical function built based it. property makes essential modern digital electronics computer processor design. here, design molecular turing machine computes nand function binary strings arbitrary length. 
purpose, perform mathematical abstraction kind operations done double-stranded dna molecule, well presenting molecular encoding input symbols machine.",4 "stochastic pooling regularization deep convolutional neural networks. introduce simple effective method regularizing large convolutional neural networks. replace conventional deterministic pooling operations stochastic procedure, randomly picking activation within pooling region according multinomial distribution, given activities within pooling region. approach hyper-parameter free combined regularization approaches, dropout data augmentation. achieve state-of-the-art performance four image datasets, relative approaches utilize data augmentation.",4 "local optima learning bayesian networks. paper proposes evaluates k-greedy equivalence search algorithm (kes) learning bayesian networks (bns) complete data. main characteristic kes allows trade-off greediness randomness, thus exploring different good local optima. greediness set maximum, kes corresponds greedy equivalence search algorithm (ges). greediness kept minimum, prove mild assumptions kes asymptotically returns inclusion optimal bn nonzero probability. experimental results synthetic real data reported showing kes often finds better local optima ges. moreover, use kes experimentally confirm number different local optima often huge.",4 "modified mel filter bank compute mfcc subsampled speech. mel frequency cepstral coefficients (mfccs) popularly used speech features speech speaker recognition applications. work, propose modified mel filter bank extract mfccs subsampled speech. also propose stronger metric effectively captures correlation mfccs original speech mfcc resampled speech. found proposed method filter bank construction performs distinguishably well gives recognition performance resampled speech close recognition accuracies original speech.",4 "efficient fpga implementation mri image filtering tumor characterization using xilinx system generator. 
paper presents efficient architecture various image filtering algorithms tumor characterization using xilinx system generator (xsg). architecture offers alternative graphical user interface combines matlab, simulink xsg explores important aspects concerned hardware implementation. performance architecture implemented spartan-3e starter kit (xc3s500e-fg320) exceeds similar greater resources architectures. proposed architecture reduces resources available target device 50%.",4 "using neural networks improve classical operating system fingerprinting techniques. present remote operating system detection inference problem: given set observations (the target host responses set tests), want infer os type probably generated observations. classical techniques used perform analysis present several limitations. improve analysis, developed tools using neural networks statistics tools. present two working modules: one uses dce-rpc endpoints distinguish windows versions, another uses nmap signatures distinguish different version windows, linux, solaris, openbsd, freebsd netbsd systems. explain details topology inner workings neural networks used, fine tuning parameters. finally show positive experimental results.",4 "l1-regularized distributed optimization: communication-efficient primal-dual framework. despite importance sparsity many large-scale applications, methods distributed optimization sparsity-inducing objectives. paper, present communication-efficient framework l1-regularized optimization distributed environment. viewing classical objectives general primal-dual setting, develop new class methods efficiently distributed applied common sparsity-inducing models, lasso, sparse logistic regression, elastic net-regularized problems. provide theoretical convergence guarantees framework, demonstrate efficiency flexibility thorough experimental comparison amazon ec2. 
proposed framework yields speedups 50x compared current state-of-the-art methods distributed l1-regularized optimization.",4 "deep learning medical image analysis. report describes research activities hasso plattner institute summarizes ph.d. plan several novel, end-to-end trainable approaches analyzing medical images using deep learning algorithm. report, example, explore different novel methods based deep learning brain abnormality detection, recognition, segmentation. report prepared doctoral consortium aime-2017 conference.",4 "dynamic high resolution deformable articulated tracking. last several years seen significant progress using depth cameras tracking articulated objects human bodies, hands, robotic manipulators. approaches focus tracking skeletal parameters fixed shape model, makes insufficient applications require accurate estimates deformable object surfaces. overcome limitation, present 3d model-based tracking system articulated deformable objects. system able track human body pose high resolution surface contours real time using commodity depth sensor gpu hardware. implement joint optimization skeleton account changes pose, vertices high resolution mesh track subject's shape. experimental results show able capture dynamic sub-centimeter surface detail folds wrinkles clothing. also show shape estimation aids kinematic pose estimation providing accurate target match point cloud. end result highly accurate spatiotemporal semantic information well suited physical human robot interaction well virtual augmented reality systems.",4 "egocentric height estimation. egocentric, first-person vision became popular recent years emerge wearable technology, different exocentric (third-person) vision distinguishable ways, one camera wearer generally visible video frames. recent work done action object recognition egocentric videos, well work biometric extraction first-person videos. height estimation useful feature soft-biometrics object tracking.
here, propose method estimating height egocentric camera without calibration reference points. used traditional computer vision approaches deep learning order determine visual cues results best height estimation. here, introduce framework inspired two stream networks comprising two convolutional neural networks, one based spatial information, one based information given optical flow frame. given egocentric video input framework, model yields height estimate output. also incorporate late fusion learn combination temporal spatial cues. comparing model methods used baselines, achieve height estimates videos mean average error 14.04 cm range 103 cm data, classification accuracy relative height (tall, medium short) 93.75% chance level 33%.",4 "exploiting qualitative knowledge learning conditional probabilities bayesian networks. algorithms learning conditional probabilities bayesian networks hidden variables typically operate within high-dimensional search space yield locally optimal solutions. one way limiting search space avoiding local optima impose qualitative constraints based background knowledge concerning domain. present method integrating formal statements qualitative constraints two learning algorithms, apn em. experiments synthetic data, method yielded networks satisfied constraints almost perfectly. accuracy learned networks consistently superior corresponding networks learned without constraints. exploitation qualitative constraints therefore appears promising way increase interpretability accuracy learned bayesian networks known structure.",4 "fca - approach leach protocol wireless sensor networks using fuzzy logic. order gather information efficiently, wireless sensor networks partitioned clusters. proposed clustering algorithms consider location base station. situation causes hot spots problem multi-hop wireless sensor networks. paper, propose fuzzy clustering algorithm (fca) aims prolong lifetime wireless sensor networks. 
fca adjusts cluster-head radius considering residual energy distance base station parameters sensor nodes. helps decreasing intra-cluster work sensor nodes closer base station lower battery level. utilize fuzzy logic handling uncertainties cluster-head radius estimation. compare algorithm leach according first node dies, half nodes alive energy-efficiency metrics. simulation results show fca performs better algorithms cases. therefore, proposed algorithm stable energy-efficient clustering algorithm.",4 "combining multi-level contexts superpixel using convolutional neural networks perform natural scene labeling. modern deep learning algorithms triggered various image segmentation approaches. however deal pixel based segmentation. however, superpixels provide certain degree contextual information reducing computation cost. approach, performed superpixel level semantic segmentation considering 3 various levels neighbours semantic contexts. furthermore, enlisted number ensemble approaches like max-voting weighted-average. also used dempster-shafer theory uncertainty analyze confusion among various classes. method proved superior number different modern approaches dataset.",4 "implicit segmentation kannada characters offline handwriting recognition using hidden markov models. describe method classification handwritten kannada characters using hidden markov models (hmms). kannada script agglutinative, simple shapes concatenated horizontally form character. results large number characters making task classification difficult. character segmentation plays significant role reducing number classes. explicit segmentation techniques suffer overlapping shapes present, common case handwritten text. use hmms take advantage agglutinative nature kannada script, allows us perform implicit segmentation characters along recognition. experiments performed chars74k dataset consists 657 handwritten characters collected across multiple users. 
gradient-based features extracted individual characters used train character hmms. use implicit segmentation technique character level resulted improvement around 10%. system also outperformed existing system tested dataset around 16%. analysis based learning curves showed increasing training data could result better accuracy. accordingly, collected additional data obtained improvement 4% 6 additional samples.",4 "decision aids adversarial planning military operations: algorithms, tools, turing-test-like experimental validation. use intelligent decision aids help alleviate challenges planning complex operations. describe integrated algorithms, tool capable translating high-level concept tactical military operation fully detailed, actionable plan, producing automatically (or human guidance) plans realistic degree detail human-like quality. tight interleaving several algorithms -- planning, adversary estimates, scheduling, routing, attrition consumption estimates -- comprise computational approach tool. although originally developed army large-unit operations, technology generic also applies number domains, particularly critical situations requiring detailed planning within constrained period time. paper, focus particularly engineering tradeoffs design tool. experimental evaluation, reminiscent turing test, tool's performance compared favorably human planners.",4 "resource aware design deep convolutional-recurrent neural network speech recognition audio-visual sensor fusion. today's automatic speech recognition systems rely acoustic signals often perform well noisy conditions. performing multi-modal speech recognition - processing acoustic speech signals lip-reading video simultaneously - significantly enhances performance systems, especially noisy environments. work presents design audio-visual system automated speech recognition, taking memory computation requirements account. first, long-short-term-memory neural network acoustic speech recognition designed. 
second, convolutional neural networks used model lip-reading features. combined lstm network model temporal dependencies perform automatic lip-reading video. finally, acoustic-speech visual lip-reading networks combined process acoustic visual features simultaneously. attention mechanism ensures performance model noisy environments. system evaluated tcd-timit 'lipspeaker' dataset audio-visual phoneme recognition clean audio additive white noise snr 0db. achieves 75.70% 58.55% phoneme accuracy respectively, 14 percentage points better state-of-the-art noise levels.",4 "verbal chunk extraction french using limited resources. way extracting french verbal chunks, inflected infinitive, explored tested effective corpus. declarative morphological local grammar rules specifying chunks simple contextual structures used, relying limited lexical information simple heuristic/statistic properties obtained restricted corpora. specific goals, architecture formalism system, linguistic information relies obtained results effective corpus presented.",4 "kernelized deep convolutional neural network describing complex images. impressive capability capture visual content, deep convolutional neural networks (cnn) demonstrated promising performance various vision-based applications, classification, recognition, object detection. however, due intrinsic structure design cnn, images complex content, achieves limited capability invariance translation, rotation, re-sizing changes, strongly emphasized scenario content-based image retrieval. paper, address problem, proposed new kernelized deep convolutional neural network. first discuss motivation experimental study demonstrate sensitivity global cnn feature basic geometric transformations. then, propose represent visual content approximate invariance geometric transformations kernelized perspective.
extract cnn features detected object-like patches aggregate patch-level cnn features form vectorial representation fisher vector model. effectiveness proposed algorithm demonstrated image search application three benchmark datasets.",4 "280 birds one stone: inducing multilingual taxonomies wikipedia using character-level classification. propose simple, yet effective, approach towards inducing multilingual taxonomies wikipedia. given english taxonomy, approach leverages interlanguage links wikipedia followed character-level classifiers induce high-precision, high-coverage taxonomies languages. experiments, demonstrate approach significantly outperforms state-of-the-art, heuristics-heavy approaches six languages. consequence work, release presumably largest accurate multilingual taxonomic resource spanning 280 languages.",4 "overcoming vanishing gradient problem plain recurrent networks. plain recurrent networks greatly suffer vanishing gradient problem gated neural networks (gnns) long-short term memory (lstm) gated recurrent unit (gru) deliver promising results many sequence learning tasks sophisticated network designs. paper shows address problem plain recurrent network analyzing gating mechanisms gnns. propose novel network called recurrent identity network (rin) allows plain recurrent network overcome vanishing gradient problem training deep models without use gates. compare model irnns lstms multiple sequence modeling benchmarks. rins demonstrate competitive performance converge faster tasks. notably, small rin models produce 12%--67% higher accuracy sequential permuted mnist datasets reach state-of-the-art performance babi question answering dataset.",4 "wordfence: text detection natural images border awareness. recent years, text recognition achieved remarkable success recognizing scanned document text. however, word recognition natural images still open problem, generally requires time consuming post-processing steps.
present novel architecture individual word detection scene images based semantic segmentation. contributions twofold: concept wordfence, detects border areas surrounding individual word novel pixelwise weighted softmax loss function penalizes background emphasizes small text regions. wordfence ensures word detected individually, new loss function provides strong training signal text word border localization. proposed technique avoids intensive post-processing, producing end-to-end word detection system. achieve superior localization recall common benchmark datasets - 92% recall icdar11 icdar13 63% recall svt. furthermore, end-to-end word recognition system achieves state-of-the-art 86% f-score icdar13.",4 "comprehensive implementation conceptual spaces. highly influential framework conceptual spaces provides geometric way representing knowledge. instances represented points concepts represented regions (potentially) high-dimensional space. based recent formalization, present comprehensive implementation conceptual spaces framework capable representing concepts inter-domain correlations, also offers variety operations concepts.",4 "dynamic stochastic approximation multi-stage stochastic optimization. paper, consider multi-stage stochastic optimization problems convex objectives conic constraints stage. present new stochastic first-order method, namely dynamic stochastic approximation (dsa) algorithm, solving types stochastic optimization problems. show dsa achieve optimal ${\cal o}(1/\epsilon^4)$ rate convergence terms total number required scenarios applied three-stage stochastic optimization problem. show rate convergence improved ${\cal o}(1/\epsilon^2)$ objective function strongly convex. also discuss variants dsa solving general multi-stage stochastic optimization problems number stages $t > 3$. developed dsa algorithms need go scenario tree order compute $\epsilon$-solution multi-stage stochastic optimization problem. 
best knowledge, first time stochastic approximation type methods generalized multi-stage stochastic optimization $t \ge 3$.",12 "annotating object instances polygon-rnn. propose approach semi-automatic annotation object instances. current methods treat object segmentation pixel-labeling problem, cast polygon prediction task, mimicking current datasets annotated. particular, approach takes input image crop sequentially produces vertices polygon outlining object. allows human annotator interfere time correct vertex needed, producing accurate segmentation desired annotator. show approach speeds annotation process factor 4.7 across classes cityscapes, achieving 78.4% agreement iou original ground-truth, matching typical agreement human annotators. cars, speed-up factor 7.3 agreement 82.2%. show generalization capabilities approach unseen datasets.",4 "quantized convolutional neural networks mobile devices. recently, convolutional neural networks (cnn) demonstrated impressive performance various computer vision tasks. however, high performance hardware typically indispensable application cnn models due high computation complexity, prohibits extensions. paper, propose efficient framework, namely quantized cnn, simultaneously speed-up computation reduce storage memory overhead cnn models. filter kernels convolutional layers weighting matrices fully-connected layers quantized, aiming minimizing estimation error layer's response. extensive experiments ilsvrc-12 benchmark demonstrate 4~6x speed-up 15~20x compression merely one percentage loss classification accuracy. quantized cnn model, even mobile devices accurately classify images within one second.",4 "found good match: keep searching? - accuracy performance iris matching using 1-to-first search. iris recognition used many applications around world, enrollment sizes large one billion persons india's aadhaar program. large enrollment sizes require special optimizations order achieve fast database searches. 
one optimization used operational scenarios 1:first search. approach, instead scanning entire database, search terminated first sufficiently good match found. saves time, ignores potentially better matches may exist unexamined portion enrollments. least one prominent successful border-crossing program used approach nearly decade, order allow users fast ""token-free"" search. work investigates search accuracy 1:first compares traditional 1:n search. several different scenarios considered trying emulate real environments best possible: range enrollment sizes, closed- open-set configurations, two iris matchers, different permutations galleries. results confirm expected accuracy degradation using 1:first search, also allow us identify acceptable working parameters significant search time reduction achieved, maintaining accuracy similar 1:n search.",4 "top-down saliency detection driven visual classification. paper presents approach top-down saliency detection guided visual classification tasks. first learn compute visual saliency specific visual task accomplished, opposed state-of-the-art methods assess saliency merely bottom-up principles. afterwards, investigate extent visual saliency support visual classification nontrivial cases. achieve this, propose salclassnet, cnn framework consisting two networks jointly trained: a) first one computing top-down saliency maps input images, b) second one exploiting computed saliency maps visual classification. test approach, collected dataset eye-gaze maps, using tobii t60 eye tracker, asking several subjects look images stanford dogs dataset, objective distinguishing dog breeds. performance analysis dataset saliency benchmarking datasets, poet, showed salclassnet outperforms state-of-the-art saliency detectors, salnet salicon. finally, analyzed performance salclassnet fine-grained recognition task found generalizes better existing visual classifiers. 
achieved results, thus, demonstrate 1) conditioning saliency detectors object classes reaches state-of-the-art performance, 2) providing explicitly top-down saliency maps visual classifiers enhances classification accuracy.",4 "using distributional semantic vector space knowledge base reasoning uncertain conditions. inherent inflexibility incompleteness commonsense knowledge bases (kb) limited usefulness. describe system called displacer performing kb queries extended analogical capabilities word2vec distributional semantic vector space (dsvs). allows system answer queries information contained original kb form. performing analogous queries semantically related terms mapping answers back context original query using displacement vectors, able give approximate answers many questions which, posed kb alone, would return results. also show hand-curated knowledge kb used increase accuracy dsvs solving analogy problems. ways, kb dsvs make other's weaknesses.",4 "multi-issue negotiation deadlines. paper studies bilateral multi-issue negotiation self-interested autonomous agents. now, number different procedures used process; three main ones package deal procedure issues bundled discussed together, simultaneous procedure issues discussed simultaneously independently other, sequential procedure issues discussed one another. since yields different outcome, key problem decide one use circumstances. specifically, consider question model agents time constraints (in form deadlines discount factors) information uncertainty (in agents know opponents utility function). model, consider issues independent interdependent determine equilibria case procedure. doing, show package deal fact optimal procedure party. 
go show that, although package deal may computationally complex two procedures, generates pareto optimal outcomes (unlike two), similar earliest latest possible times agreement simultaneous procedure (which better sequential procedure), (like two procedures) generates unique outcome certain conditions (which define).",4 "rationally biased learning. human perception decision biases grounded form rationality? return camp hunting gathering. see grass moving. know probability snake grass. cross grass - risk bitten snake - make long, hence costly, detour? based storyline, consider rational decision maker maximizing expected discounted utility learning. show optimal behavior displays three biases: status quo, salience, overestimation small probabilities. biases product rational behavior.",4 coercive region-level registration multi-modal images. propose coercive approach simultaneously register segment multi-modal images share similar spatial structure. registration done region level facilitate data fusion avoiding need interpolation. algorithm performs alternating minimization objective function informed statistical models pixel values different modalities. hypothesis tests developed determine whether refine segmentations splitting regions. demonstrate approach significantly better performance state-of-the-art registration segmentation methods microscopy images.,4 "large scale language modeling automatic speech recognition. large language models proven quite beneficial variety automatic speech recognition tasks google. summarize results voice search youtube speech transcription tasks highlight impact one expect increasing amount training data, size language model estimated data. depending task, availability amount training data used, language model size amount work care put integrating lattice rescoring step observe reductions word error rate 6% 10% relative, systems wide range operating points 17% 52% word error rate.",4 "fast convnets using group-wise brain damage. 
revisit idea brain damage, i.e. pruning coefficients neural network, suggest brain damage modified used speedup convolutional layers. approach uses fact many efficient implementations reduce generalized convolutions matrix multiplications. suggested brain damage process prunes convolutional kernel tensor group-wise fashion adding group-sparsity regularization standard training process. group-wise pruning, convolutions reduced multiplications thinned dense matrices, leads speedup. comparison alexnet, method achieves competitive performance.",4 "spatial features multi-font/multi-size kannada numerals vowels recognition. paper presents multi-font/multi-size kannada numerals vowels recognition based spatial features. directional spatial features viz stroke density, stroke length number stokes image employed potential features characterize printed kannada numerals vowels. based features 1100 numerals 1400 vowels classified multi-class support vector machines (svm). proposed system achieves recognition accuracy 98.45% 90.64% numerals vowels respectively.",4 "quantized memory-augmented neural networks. memory-augmented neural networks (manns) refer class neural network models equipped external memory (such neural turing machines memory networks). neural networks outperform conventional recurrent neural networks (rnns) terms learning long-term dependency, allowing solve intriguing ai tasks would otherwise hard address. paper concerns problem quantizing manns. quantization known effective deploy deep models embedded systems limited resources. furthermore, quantization substantially reduce energy consumption inference procedure. benefits justify recent developments quantized multi layer perceptrons, convolutional networks, rnns. however, prior work reported successful quantization manns. in-depth analysis presented reveals various challenges appear quantization networks. 
without addressing properly, quantized manns would normally suffer excessive quantization error leads degraded performance. paper, identify memory addressing (specifically, content-based addressing) main reason performance degradation propose robust quantization method manns address challenge. experiments, achieved computation-energy gain 22x 8-bit fixed-point binary quantization compared floating-point implementation. measured babi dataset, resulting model, named quantized mann (q-mann), improved error rate 46% 30% 8-bit fixed-point binary quantization, respectively, compared mann quantized using conventional techniques.",4 "fusion hyperspectral panchromatic images using spectral unmixing results. hyperspectral imaging, due providing high spectral resolution images, one important tools remote sensing field. technological restrictions hyperspectral sensors limited spatial resolution. hand panchromatic image better spatial resolution. combining information together provide better understanding target scene. spectral unmixing mixed pixels hyperspectral images results spectral signature abundance fractions endmembers gives information location mixed pixel. paper used spectral unmixing results hyperspectral images segmentation results panchromatic image data fusion. proposed method applied simulated data using aviris indian pines datasets. results show method effectively combine information hyperspectral panchromatic images.",4 "survey credit card fraud detection techniques: data technique oriented perspective. credit card plays important role today's economy. becomes unavoidable part household, business global activities. although using credit cards provides enormous benefits used carefully responsibly, significant credit financial damages may caused fraudulent activities. many techniques proposed confront growth credit card fraud. however, techniques goal avoiding credit card fraud; one drawbacks, advantages characteristics. 
paper, investigating difficulties credit card fraud detection, seek review state art credit card fraud detection techniques, data sets evaluation criteria. the advantages disadvantages fraud detection methods enumerated compared. furthermore, classification mentioned techniques two main fraud detection approaches, namely, misuses (supervised) anomaly detection (unsupervised) presented. again, classification techniques proposed based capability process numerical categorical data sets. different data sets used literature described grouped real synthesized data effective common attributes extracted usage. moreover, evaluation employed criterions literature collected discussed. consequently, open issues credit card fraud detection explained guidelines new researchers.",4 "answer sequence learning neural networks answer selection community question answering. paper, answer selection problem community question answering (cqa) regarded answer sequence labeling task, novel approach proposed based recurrent architecture problem. approach applies convolution neural networks (cnns) learning joint representation question-answer pair firstly, uses joint representation input long short-term memory (lstm) learn answer sequence question labeling matching quality answer. experiments conducted semeval 2015 cqa dataset shows effectiveness approach.",4 "role word length semantic topology. topological argument presented concerning structure semantic space, based negative correlation polysemy word length. resulting graph structure applied modeling free-recall experiments, resulting predictions comparative values recall probabilities. associative recall found favor longer words whereas sequential recall found favor shorter words. data peers experiments lohnas et al. (2015) healey kahana (2016) confirm predictions, correlation coefficients $r_{seq}= -0.17$ $r_{ass}= +0.17$. 
argument applied predicting global properties list recall, leads novel explanation word-length effect based optimization retrieval strategies.",16 "learning-based image reconstruction via parallel proximal algorithm. past decade, sparsity-driven regularization led advancement image reconstruction algorithms. traditionally, regularizers rely analytical models sparsity (e.g. total variation (tv)). however, recent methods increasingly centered around data-driven arguments inspired deep learning. letter, propose generalize tv regularization replacing l1-penalty alternative prior trainable. specifically, method learns prior via extending recently proposed fast parallel proximal algorithm (fppa) incorporate data-adaptive proximal operators. proposed framework require additional inner iterations evaluating proximal mappings corresponding learned prior. moreover, formalism ensures training reconstruction processes share algorithmic structure, making end-to-end implementation intuitive. example, demonstrate algorithm problem deconvolution fluorescence microscope.",4 "survey calibration methods optical see-through head-mounted displays. optical see-through head-mounted displays (ost hmds) major output medium augmented reality, seen significant growth popularity usage among general public due growing release consumer-oriented models, microsoft hololens. unlike virtual reality headsets, ost hmds inherently support addition computer-generated graphics directly light path user's eyes view physical world. augmented virtual reality systems, physical position ost hmd typically determined external embedded 6-degree-of-freedom tracking system. however, order properly render virtual objects, perceived spatially aligned physical environment, also necessary accurately measure position user's eyes within tracking system's coordinate frame. 20 years, researchers proposed various calibration methods determine needed eye position. 
however, date, comprehensive overview procedures requirements. hence, paper surveys field calibration methods ost hmds. specifically, provides insights fundamentals calibration techniques, presents overview manual automatic approaches, well evaluation methods metrics. finally, also identifies opportunities future research.",4 "integrating human-provided information belief state representation using dynamic factorization. partially observed environments, useful human provide robot declarative information augments direct sensory observations. instance, given robot search-and-rescue mission, human operator might suggest locations interest. provide representation robot's internal knowledge supports efficient combination raw sensory information high-level declarative information presented formal language. computational efficiency achieved dynamically selecting appropriate factoring belief state, combining aspects belief correlated information separating not. strategy works open domains, set possible objects known advance, provides significant improvements inference time, leading efficient planning complex partially observable tasks. validate approach experimentally two open-domain planning problems: 2d discrete gridworld task 3d continuous cooking task.",4 "decision-making support system based know-how. research results described concerned with: - developing domain modeling method tools provide design implementation decision-making support systems computer integrated manufacturing; - building decision-making support system based know-how software environment. research funded nedo, japan.",4 computational geometry column 38. recent results curve reconstruction described.,4 "denet: scalable real-time object detection directed sparse sampling. define object detection imagery problem estimating large extremely sparse bounding box dependent probability distribution. 
subsequently identify sparse distribution estimation scheme, directed sparse sampling, employ single end-to-end cnn based detection model. methodology extends formalizes previous state-of-the-art detection models additional emphasis high evaluation rates reduced manual engineering. introduce two novelties, corner based region-of-interest estimator deconvolution based cnn model. resulting model scene adaptive, require manually defined reference bounding boxes produces highly competitive results mscoco, pascal voc 2007 pascal voc 2012 real-time evaluation rates. analysis suggests model performs particularly well fine-grained object localization desirable. argue advantage stems significantly larger set available regions-of-interest relative methods. source-code available from: https://github.com/lachlants/denet",4 "minimax optimal algorithms unconstrained linear optimization. design analyze minimax-optimal algorithms online linear optimization games player's choice unconstrained. player strives minimize regret, difference loss loss post-hoc benchmark strategy. standard benchmark loss best strategy chosen bounded comparator set. comparison set adversary's gradients satisfy l_infinity bounds, give value game closed form prove approaches sqrt(2t/pi) -> infinity. interesting algorithms result consider soft constraints comparator, rather restricting bounded set. warmup, analyze game quadratic penalty. value game exactly t/2, value achieved perhaps simplest online algorithm all: unprojected gradient descent constant learning rate. derive minimax-optimal algorithm much softer penalty function. algorithm achieves good bounds standard notion regret comparator point, without needing specify comparator set advance. value game converges sqrt{e} -> infinity; give closed-form exact value function t. resulting algorithm natural unconstrained investment betting scenarios, since guarantees worst constant loss, allowing exponential reward ""easy"" adversary.",4 "rotational unit memory. 
concepts unitary evolution matrices associative memory boosted field recurrent neural networks (rnn) state-of-the-art performance variety sequential tasks. however, rnn still limited capacity manipulate long-term memory. bypass weakness successful applications rnn use external techniques attention mechanisms. paper propose novel rnn model unifies state-of-the-art approaches: rotational unit memory (rum). core rum rotational operation, is, naturally, unitary matrix, providing architectures power learn long-term dependencies overcoming vanishing exploding gradients problem. moreover, rotational unit also serves associative memory. evaluate model synthetic memorization, question answering language modeling tasks. rum learns copying memory task completely improves state-of-the-art result recall task. rum's performance babi question answering task comparable models attention mechanism. also improve state-of-the-art result 1.189 bits-per-character (bpc) loss character level penn treebank (ptb) task, signify applications rum real-world sequential data. universality construction, core rnn, establishes rum promising approach language modeling, speech recognition machine translation.",4 "deep convolutional neural networks predominant instrument recognition polyphonic music. identifying musical instruments polyphonic music recordings challenging important problem field music information retrieval. enables music search instrument, helps recognize musical genres, make music transcription easier accurate. paper, present convolutional neural network framework predominant instrument recognition real-world polyphonic music. train network fixed-length music excerpts single-labeled predominant instrument estimate arbitrary number predominant instruments audio signal variable length. obtain audio-excerpt-wise result, aggregate multiple outputs sliding windows test audio. 
so, investigated two different aggregation methods: one takes average instrument takes instrument-wise sum followed normalization. addition, conducted extensive experiments several important factors affect performance, including analysis window size, identification threshold, activation functions neural networks find optimal set parameters. using dataset 10k audio excerpts 11 instruments evaluation, found convolutional neural networks robust conventional methods exploit spectral features source separation support vector machines. experimental results showed proposed convolutional network architecture obtained f1 measure 0.602 micro 0.503 macro, respectively, achieving 19.6% 16.4% performance improvement compared state-of-the-art algorithms.",4 "topic modeling short texts incorporating word embeddings. inferring topics overwhelming amount short texts becomes critical challenging task many content analysis tasks, content charactering, user interest profiling, emerging topic detecting. existing methods probabilistic latent semantic analysis (plsa) latent dirichlet allocation (lda) cannot solve problem well since limited word co-occurrence information available short texts. paper studies incorporate external word correlation knowledge short texts improve coherence topic modeling. based recent results word embeddings learn semantically representations words large corpus, introduce novel method, embedding-based topic model (etm), learn latent topics short texts. etm solves problem limited word co-occurrence information aggregating short texts long pseudo-texts, also utilizes markov random field regularized model gives correlated words better chance put topic. experiments real-world datasets validate effectiveness model comparing state-of-the-art models.",4 "modelling probability density markov sources. paper introduces objective function seeks minimise average total number bits required encode joint state layers markov source. 
type encoder may applied problem optimising bottom-up (recognition model) top-down (generative model) connections multilayer neural network, unifies several previous results optimisation multilayer neural networks.",4 "feature selection parallel technique remotely sensed imagery classification. remote sensing research focusing feature selection long attracted attention remote sensing community feature selection prerequisite image processing various applications. different feature selection methods proposed improve classification accuracy. vary basic search techniques clonal selections, various optimal criteria investigated. recently, methods using dependence-based measures attracted much attention due ability deal high dimensional datasets. however, methods based cramer's v test, performance issues large datasets. paper, propose parallel approach improve performance. evaluate approach hyper-spectral high spatial resolution images compare proposed methods centralized version preliminary results. results promising.",4 "minimizing inter-subject variability fnirs based brain computer interfaces via multiple-kernel support vector learning. brain signal variability measurements obtained different subjects different sessions significantly deteriorates accuracy brain-computer interface (bci) systems. moreover variabilities, also known inter-subject inter-session variabilities, require lengthy calibration sessions bci system used. furthermore, calibration session repeated subject independently use bci due inter-session variability. study, present algorithm order minimize above-mentioned variabilities overcome time-consuming usually error-prone calibration time. algorithm based linear programming support-vector machines extensions multiple kernel learning framework. tackle inter-subject -session variability feature spaces classifiers. done incorporating subject- session-specific feature spaces much richer feature spaces set optimal decision boundaries. 
decision boundary represents subject- session specific spatio-temporal variabilities neural signals. consequently, single classifier multiple feature spaces generalize well new unseen test patterns even without calibration steps. demonstrate classifiers maintain good performances even presence large degree bci variability. present study analyzes bci variability related oxy-hemoglobin neural signals measured using functional near-infrared spectroscopy.",19 "sampling optimization space measures: langevin dynamics composite optimization problem. study sampling optimization space measures. focus gradient flow-based optimization langevin dynamics case study. investigate source bias unadjusted langevin algorithm (ula) discrete time, consider remove reduce bias. point difficulty heat flow exactly solvable, neither forward backward method implementable general, except gaussian data. propose symmetrized langevin algorithm (sla), smaller bias ula, price implementing proximal gradient step space. show sla fact consistent gaussian target measure, whereas ula not. also illustrate various algorithms explicitly gaussian target measure, including gradient descent, proximal gradient, forward-backward, show consistent.",12 "real-time 3d shape micro-details. motivated growing demand interactive environments, propose accurate real-time 3d shape reconstruction technique. provide reliable 3d reconstruction still challenging task dealing real-world applications, integrate several components including (i) photometric stereo (ps), (ii) perspective cook-torrance reflectance model enables ps deal broad range possible real-world object reflections, (iii) realistic lightening situation, (iv) recurrent optimization network (ron) finally (v) heuristic dijkstra gaussian mean curvature (dgmc) initialization approach. demonstrate potential benefits hybrid model providing 3d shape highly-detailed information micro-prints first time. 
real-world images taken mobile phone camera simple setup consumer-level equipment. addition, complementary synthetic experiments confirm beneficial properties novel method superiority state-of-the-art approaches.",4 "overdispersed black-box variational inference. introduce overdispersed black-box variational inference, method reduce variance monte carlo estimator gradient black-box variational inference. instead taking samples variational distribution, use importance sampling take samples overdispersed distribution exponential family variational approximation. approach general since readily applied exponential family distribution, typical choice variational approximation. run experiments two non-conjugate probabilistic models show method effectively reduces variance, overhead introduced computation proposal parameters importance weights negligible. find overdispersed importance sampling scheme provides lower variance black-box variational inference, even latter uses twice number samples. results faster convergence black-box inference procedure.",19 "capturing localized image artifacts cnn-based hyper-image representation. training deep cnns capture localized image artifacts relatively small dataset challenging task. enough images hand, one hope deep cnn characterizes localized artifacts entire data effect output. however, smaller datasets, deep cnns may overfit shallow ones find hard capture local artifacts. thus image-based small-data applications first train framework collection patches (instead entire image) better learn representation localized artifacts. output obtained averaging patch-level results. approach ignores spatial correlation among patches various patch locations affect output. also fails cases patches mainly contribute image label. combat scenarios, develop notion hyper-image representations. cnn two stages. first stage trained patches. second stage utilizes last layer representation developed first stage form hyper-image, used train second stage. 
show approach able develop better mapping image output. analyze additional properties approach show effectiveness one synthetic two real-world vision tasks - no-reference image quality estimation image tampering detection - performance improvement existing strong baselines.",4 approximate principal direction trees. introduce new spatial data structure high dimensional data called \emph{approximate principal direction tree} (apd tree) adapts intrinsic dimension data. algorithm ensures vector-quantization accuracy similar computationally-expensive pca trees similar time-complexity lower-accuracy rp trees. apd trees use small number power-method iterations find splitting planes recursively partitioning data. provide natural trade-off running-time accuracy achieved rp pca trees. theoretical results establish a) strong performance guarantees regardless convergence rate power-method b) $o(\log d)$ iterations suffice establish guarantee pca trees intrinsic dimension $d$. demonstrate trade-off efficacy data structure cpu gpu.,4 "geometrical interpretation shannon's entropy based born rule. paper analyze discrete probability distributions probabilities particular outcomes experiment (microstates) represented ratio natural numbers (in words, probabilities represented digital numbers finite representation length). introduce several results based recently proposed joystick probability selector, represents geometrical interpretation probability based born rule. terms generic space generic dimension discrete distribution, well as, effective dimension going introduced. shown simple geometric representation lead optimal code length coding sequence signals. then, give new, geometrical, interpretation shannon entropy discrete distribution. suggest shannon entropy represents logarithm effective dimension distribution. proposed geometrical interpretation shannon entropy used prove information inequalities elementary way.",4 "learning games rademacher observations losses. 
recently shown supervised learning popular logistic loss equivalent optimizing exponential loss sufficient statistics class: rademacher observations (rados). first show unexpected equivalence actually generalized example / rado losses, necessary sufficient conditions equivalence, exemplified four losses bear popular names various fields: exponential (boosting), mean-variance (finance), linear hinge (on-line learning), relu (deep learning), unhinged (statistics). second, show generalization unveils surprising new connection regularized learning, particular sufficient condition regularizing loss examples equivalent regularizing rados (with minkowski sums) equivalent rado loss. brings simple powerful rado-based learning algorithms sparsity-controlling regularization, exemplify boosting algorithm regularized exponential rado-loss, formally boosts four types regularization, including popular ridge lasso, recently coined slope --- obtain first proven boosting algorithm last regularization. first contribution equivalence rado example-based losses, omega-r.adaboost appears efficient proxy boost regularized logistic loss examples using whichever four regularizers. experiments display regularization consistently improves performances rado-based learning, may challenge beat state art example-based learning even learning small sets rados. finally, connect regularization differential privacy, display tiny budgets afforded big domains beating (protected) example-based learning.",4 "approximate bayesian long short-term memory algorithm outlier detection. long short-term memory networks trained gradient descent back-propagation received great success various applications. however, point estimation weights networks prone over-fitting problems lacks important uncertainty information associated estimation. however, exact bayesian neural network methods intractable non-applicable real-world applications.
study, propose approximate estimation weights uncertainty using ensemble kalman filter, easily scalable large number weights. furthermore, optimize covariance noise distribution ensemble update step using maximum likelihood estimation. assess proposed algorithm, apply outlier detection five real-world events retrieved twitter platform.",4 "analysis first prototype universal intelligence tests: evaluating comparing ai algorithms humans. today, available methods assess ai systems focused using empirical techniques measure performance algorithms specific tasks (e.g., playing chess, solving mazes land helicopter). however, methods appropriate want evaluate general intelligence ai and, even less, compare human intelligence. anynt project designed new method evaluation tries assess ai systems using well known computational notions problems general possible. new method serves assess general intelligence (which allows us learn solve new kind problem face) evaluate performance set specific tasks. method focuses measuring intelligence algorithms, also assess intelligent system (human beings, animals, ai, aliens?,...), letting us place results scale and, therefore, able compare them. new approach allow us (in future) evaluate compare kind intelligent system known even build/find, artificial biological. master thesis aims ensuring new method provides consistent results evaluating ai algorithms, done design implementation prototypes universal intelligence tests application different intelligent systems (ai algorithms human beings). study analyze whether results obtained two different intelligent systems properly located scale propose changes refinements prototypes order to, future, able achieve truly universal intelligence test.",4 "sentiment analysis financial news headlines using training dataset augmentation. paper discusses approach taken uwaterloo team arrive solution fine-grained sentiment analysis problem posed task 5 semeval 2017.
paper describes document vectorization sentiment score prediction techniques used, well design implementation decisions taken building system task. system uses text vectorization models, n-gram, tf-idf paragraph embeddings, coupled regression model variants predict sentiment scores. amongst methods examined, unigrams bigrams coupled simple linear regression obtained best baseline accuracy. paper also explores data augmentation methods supplement training dataset. system designed subtask 2 (news statements headlines).",4 "deep learning good steganalysis tool embedding key reused different images, even cover source-mismatch. since boss competition, 2010, steganalysis approaches use learning methodology involving two steps: feature extraction, rich models (rm), image representation, use ensemble classifier (ec) learning step. 2015, qian et al. shown use deep learning approach jointly learns computes features, promising steganalysis. paper, follow-up study qian et al., show that, due intrinsic joint minimization, results obtained convolutional neural network (cnn) fully connected neural network (fnn), well parameterized, surpass conventional use rm ec. first, numerous experiments conducted order find best ""shape"" cnn. second, experiments carried clairvoyant scenario order compare cnn fnn rm ec. results show 16% reduction classification error cnn fnn. third, experiments also performed cover-source mismatch setting. results show cnn fnn naturally robust mismatch problem. addition experiments, provide discussions internal mechanisms cnn, weave links previously stated ideas, order understand impressive results obtained.",4 "sparsity-based defense adversarial attacks linear classifiers. deep neural networks represent state art machine learning growing number fields, including vision, speech natural language processing.
however, recent work raises important questions robustness architectures, showing possible induce classification errors tiny, almost imperceptible, perturbations. vulnerability ""adversarial attacks"", ""adversarial examples"", conjectured due excessive linearity deep networks. paper, study phenomenon setting linear classifier, show possible exploit sparsity natural data combat $\ell_{\infty}$-bounded adversarial perturbations. specifically, demonstrate efficacy sparsifying front end via ensemble averaged analysis, experimental results mnist handwritten digit database. best knowledge, first work show sparsity provides theoretically rigorous framework defense adversarial attacks.",19 "hierarchical internal representation spectral features deep convolutional networks trained eeg decoding. recently, increasing interest research interpretability machine learning models, example transform internally represent eeg signals brain-computer interface (bci) applications. help understand limits model may improved, addition possibly provide insight data itself. schirrmeister et al. (2017) recently reported promising results eeg decoding deep convolutional neural networks (convnets) trained end-to-end manner and, causal visualization approach, showed learn use spectral amplitude changes input. study, investigate convnets represent spectral features sequence intermediate stages network. show higher sensitivity eeg phase features earlier stages higher sensitivity eeg amplitude features later stages. intriguingly, observed specialization individual stages network classical eeg frequency bands alpha, beta, high gamma. furthermore, find first evidence particularly last convolutional layer, network learns detect complex oscillatory patterns beyond spectral phase amplitude, reminiscent representation complex visual features later layers convnets computer vision tasks. 
findings thus provide insights convnets hierarchically represent spectral eeg features intermediate layers suggest convnets exploit might help better understand compositional structure eeg time series.",4 "combinatorial multi-armed bandits filtered feedback. motivated problems search detection present solution combinatorial multi-armed bandit (cmab) problem heavy-tailed reward distributions new class feedback, filtered semibandit feedback. cmab problem agent pulls combination arms set $\{1,...,k\}$ round, generating random outcomes probability distributions associated arms receiving overall reward. semibandit feedback assumed random outcomes generated observed. filtered semibandit feedback allows outcomes observed sampled second distribution conditioned initial random outcomes. feedback mechanism valuable allows cmab methods applied sequential search detection problems combinatorial actions made, true rewards (number objects interest appearing round) observed, rather filtered reward (the number objects searcher successfully finds, must definition less number appear). present upper confidence bound type algorithm, robust-f-cucb, associated regret bound order $\mathcal{o}(\ln(n))$ balance exploration exploitation face filtering reward heavy tailed reward distributions.",4 "progressive representation adaptation weakly supervised object localization. address problem weakly supervised object localization image-level annotations available training object detectors. numerous methods proposed tackle problem mining object proposals. however, substantial amount noise object proposals causes ambiguities learning discriminative object models. approaches sensitive model initialization often converge undesirable local minimum solutions. paper, propose overcome drawbacks progressive representation adaptation two main steps: 1) classification adaptation 2) detection adaptation. 
classification adaptation, transfer pre-trained network multi-label classification task recognizing presence certain object image. classification adaptation step, network learns discriminative representations specific object categories interest. detection adaptation, mine class-specific object proposals exploiting two scoring strategies based adapted classification network. class-specific proposal mining helps remove substantial noise background clutter potential confusion similar objects. refine proposals using multiple instance learning segmentation cues. using refined object bounding boxes, fine-tune layer classification network obtain fully adapted detection network. present detailed experimental validation pascal voc ilsvrc datasets. experimental results demonstrate progressive representation adaptation algorithm performs favorably state-of-the-art methods.",4 "algorithms generating ordered solutions explicit and/or structures. present algorithms generating alternative solutions explicit acyclic and/or structures non-decreasing order cost. proposed algorithms use best first search technique report solutions using implicit representation ordered cost. paper, present two versions search algorithm -- (a) initial version best first search algorithm, asg, may present one solution generating ordered solutions, (b) another version, lasg, avoids construction duplicate solutions. actual solutions reconstructed quickly implicit compact representation used. applied methods test domains, synthetic others based well known problems including search space 5-peg tower hanoi problem, matrix-chain multiplication problem problem finding secondary structure rna. experimental results show efficacy proposed algorithms existing approach. proposed algorithms potential use various domains ranging knowledge based frameworks service composition, and/or structure widely used representing problems.",4 "training quantized nets: deeper understanding. 
currently, deep neural networks deployed low-power portable devices first training full-precision model using powerful hardware, deriving corresponding low-precision model efficient inference systems. however, training models directly coarsely quantized weights key step towards learning embedded platforms limited computing resources, memory capacity, power consumption. numerous recent publications studied methods training quantized networks, studies mostly empirical. work, investigate training methods quantized neural networks theoretical viewpoint. first explore accuracy guarantees training methods convexity assumptions. look behavior algorithms non-convex problems, show training algorithms exploit high-precision representations important greedy search phase purely quantized training methods lack, explains difficulty training using low-precision arithmetic.",4 "adaptive strategy superpixel-based region-growing image segmentation. work presents region-growing image segmentation approach based superpixel decomposition. initial contour-constrained over-segmentation input image, image segmentation achieved iteratively merging similar superpixels regions. approach raises two key issues: (1) compute similarity superpixels order perform accurate merging (2) order superpixels must merged together. perspective, firstly introduce robust adaptive multi-scale superpixel similarity region comparisons made content common border level. secondly, propose global merging strategy efficiently guide region merging process. strategy uses adaptive merging criterion ensure best region aggregations given highest priorities. allows reach final segmentation consistent regions strong boundary adherence. perform experiments bsds500 image dataset highlight extent method compares favorably well-known image segmentation algorithms. obtained results demonstrate promising potential proposed approach.",4 "counterexample guided abstraction refinement algorithm propositional circumscription.
circumscription representative example nonmonotonic reasoning inference technique. circumscription often studied first order theories, propositional version also subject extensive research, shown equivalent extended closed world assumption (ecwa). moreover, entailment propositional circumscription well-known example decision problem second level polynomial hierarchy. paper proposes new boolean satisfiability (sat)-based algorithm entailment propositional circumscription explores relationship propositional circumscription minimal models. new algorithm inspired ideas commonly used sat-based model checking, namely counterexample guided abstraction refinement. addition, new algorithm refined compute theory closure generalized close world assumption (gcwa). experimental results show new algorithm solve problem instances solutions unable solve.",4 "improving agreement disagreement identification online discussions socially-tuned sentiment lexicon. study problem agreement disagreement detection online discussions. isotonic conditional random fields (isotonic crf) based sequential model proposed make predictions sentence- segment-level. automatically construct socially-tuned lexicon bootstrapped existing general-purpose sentiment lexicons improve performance. evaluate agreement disagreement tagging model two disparate online discussion corpora -- wikipedia talk pages online debates. model shown outperform state-of-the-art approaches datasets. example, isotonic crf model achieves f1 scores 0.74 0.67 agreement disagreement detection, linear chain crf obtains 0.58 0.56 discussions wikipedia talk pages.",4 "deep learning algorithm one-step contour aware nuclei segmentation histopathological images. paper addresses task nuclei segmentation high-resolution histopathological images. propose automatic end-to-end deep neural network algorithm segmentation individual nuclei.
nucleus-boundary model introduced predict nuclei boundaries simultaneously using fully convolutional neural network. given color normalized image, model directly outputs estimated nuclei map boundary map. simple, fast parameter-free post-processing procedure performed estimated nuclei map produce final segmented nuclei. overlapped patch extraction assembling method also designed seamless prediction nuclei large whole-slide images. also show effectiveness data augmentation methods nuclei segmentation task. experiments showed method outperforms prior state-of-the-art methods. moreover, efficient one 1000x1000 image segmented less 5 seconds. makes possible precisely segment whole-slide image acceptable time.",4 "bootstrapping lexical choice via multiple-sequence alignment. important component generation system mapping dictionary, lexicon elementary semantic expressions corresponding natural language realizations. typically, labor-intensive knowledge-based methods used construct dictionary. instead propose acquire automatically via novel multiple-pass algorithm employing multiple-sequence alignment, technique commonly used bioinformatics. crucially, method leverages latent information contained multi-parallel corpora -- datasets supply several verbalizations corresponding semantics rather one. used techniques generate natural language versions computer-generated mathematical proofs, good results per-component overall-output basis. example, evaluations involving dozen human judges, system produced output whose readability faithfulness semantic input rivaled traditional generation system.",4 "use lose it: selective memory forgetting perpetual learning machine. recent article described new type deep neural network - perpetual learning machine (plm) - capable learning 'on fly' like brain existing state perpetual stochastic gradient descent (psgd). here, simulating process practice, demonstrate selective memory selective forgetting introduce statistical recall biases psgd.
frequently recalled memories remembered, whilst memories recalled rarely forgotten. results 'use lose it' stimulus driven memory process similar human memory.",4 "max-margin nonparametric latent feature models link prediction. present max-margin nonparametric latent feature model, unites ideas max-margin learning bayesian nonparametrics discover discriminative latent features link prediction automatically infer unknown latent social dimension. minimizing hinge-loss using linear expectation operator, perform posterior inference efficiently without dealing highly nonlinear link likelihood function; using fully-bayesian formulation, avoid tuning regularization constants. experimental results real datasets appear demonstrate benefits inherited max-margin learning fully-bayesian nonparametric inference.",4 "kernel risk-sensitive loss: definition, properties application robust adaptive filtering. nonlinear similarity measures defined kernel space, correntropy, extract higher-order statistics data offer potentially significant performance improvement linear counterparts especially non-gaussian signal processing machine learning. work, propose new similarity measure kernel space, called kernel risk-sensitive loss (krsl), provide important properties. apply krsl adaptive filtering investigate robustness, develop mkrsl algorithm analyze mean square convergence performance. compared correntropy, krsl offer efficient performance surface, thereby enabling gradient based method achieve faster convergence speed higher accuracy still maintaining robustness outliers. theoretical analysis results superior performance new algorithm confirmed simulation.",19 "challenge multi-camera tracking. multi-camera tracking quite different single camera tracking, faces new technology system architecture challenges. 
analyzing corresponding characteristics disadvantages existing algorithms, problems multi-camera tracking summarized new directions future work also generalized.",4 "elu network total variation image denoising. paper, propose novel convolutional neural network (cnn) image denoising, uses exponential linear unit (elu) activation function. investigate suitability analyzing elu's connection trainable nonlinear reaction diffusion model (tnrd) residual denoising. hand, batch normalization (bn) indispensable residual denoising convergence purpose. however, direct stacking bn elu degrades performance cnn. mitigate issue, design innovative combination activation layer normalization layer exploit leverage elu network, discuss corresponding rationale. moreover, inspired fact minimizing total variation (tv) applied image denoising, propose tv regularized l2 loss evaluate training effect iterations. finally, conduct extensive experiments, showing model outperforms recent popular approaches gaussian denoising specific randomized noise levels gray color images.",4 "exploiting feature class relationships video categorization regularized deep neural networks. paper, study challenging problem categorizing videos according high-level semantics existence particular human action complex event. although extensive efforts devoted recent years, existing works combined multiple video features using simple fusion strategies neglected utilization inter-class semantic relationships. paper proposes novel unified framework jointly exploits feature relationships class relationships improved categorization performance. specifically, two types relationships estimated utilized rigorously imposing regularizations learning process deep neural network (dnn). regularized dnn (rdnn) efficiently realized using gpu-based implementation affordable training cost. arming dnn better capability harnessing feature class relationships, proposed rdnn suitable modeling video semantics. 
extensive experimental evaluations, show rdnn produces superior performance several state-of-the-art approaches. well-known hollywood2 columbia consumer video benchmarks, obtain competitive results: 66.9\% 73.5\% respectively terms mean average precision. addition, substantially evaluate rdnn stimulate future research large scale video categorization, collect release new benchmark dataset, called fcvid, contains 91,223 internet videos 239 manually annotated categories.",4 "clustering multidimensional data pso based algorithm. data clustering recognized data analysis method data mining whereas k-means well known partitional clustering method, possessing pleasant features. observed that, k-means partitional clustering techniques suffer several limitations initial cluster centre selection, preknowledge number clusters, dead unit problem, multiple cluster membership premature convergence local optima. several optimization methods proposed literature order solve clustering limitations, swarm intelligence (si) achieved remarkable position concerned area. particle swarm optimization (pso) popular si technique one favorite areas researchers. paper, present brief overview pso applicability variants solve clustering challenges. also, propose advanced pso algorithm named subtractive clustering based boundary restricted adaptive particle swarm optimization (sc-br-apso) algorithm clustering multidimensional data. comparison purpose, studied analyzed various algorithms k-means, pso, k-means-pso, hybrid subtractive + pso, brapso, proposed algorithm nine different datasets. motivation behind proposing sc-br-apso algorithm deal multidimensional data clustering, minimum error rate maximum convergence rate.",4 "ian: individual aggregation network person search. person search real-world scenarios new challenging computer vision task many meaningful applications.
challenge task mainly comes from: (1) unavailable bounding boxes pedestrians model needs search person whole gallery images; (2) huge variance visual appearance particular person owing varying poses, lighting conditions, occlusions. address two critical issues modern person search applications, propose novel individual aggregation network (ian) accurately localize persons learning minimize intra-person feature variations. ian built upon state-of-the-art object detection framework, i.e., faster r-cnn, high-quality region proposals pedestrians produced online manner. addition, relieve negative effect caused varying visual appearances individual, ian introduces novel center loss increase intra-class compactness feature representations. engaged center loss encourages persons identity similar feature characteristics. extensive experimental results two benchmarks, i.e., cuhk-sysu prw, well demonstrate superiority proposed model. particular, ian achieves 77.23% map 80.45% top-1 accuracy cuhk-sysu, outperform state-of-the-art 1.7% 1.85%, respectively.",4 "value alignment, fair play, rights service robots. ethics safety research artificial intelligence increasingly framed terms ""alignment"" human values interests. argue turing's call ""fair play machines"" early often overlooked contribution alignment literature. turing's appeal fair play suggests need correct human behavior accommodate machines, surprising inversion value alignment treated today. reflections ""fair play"" motivate novel interpretation turing's notorious ""imitation game"" condition intelligence instead value alignment: machine demonstrates minimal degree alignment (with norms conversation, instance) go undetected interrogated human. carefully distinguish interpretation moral turing test, motivated principle fair play, instead depends imitation human moral behavior. finally, consider framework fair play used situate debate robot rights within alignment literature. 
argue extending rights service robots operating public spaces ""fair"" precisely sense encourages alignment interests humans machines.",4 "online unsupervised feature learning visual tracking. feature encoding respect over-complete dictionary learned unsupervised methods, followed spatial pyramid pooling, linear classification, exhibited powerful strength various vision applications. propose use feature learning pipeline visual tracking. tracking implemented using tracking-by-detection resulted framework simple yet effective. first, online dictionary learning used build dictionary, captures appearance changes tracking target well background changes. given test image window, extract local image patches local patch encoded respect dictionary. encoded features pooled spatial pyramid form aggregated feature vector. finally, simple linear classifier trained features. experiments show proposed powerful---albeit simple---tracker, outperforms state-of-the-art tracking methods tested. moreover, evaluate performance different dictionary learning feature encoding methods proposed tracking framework, analyse impact component tracking scenario. also demonstrate flexibility feature learning plugging hare et al.'s tracking method. outcome is, knowledge, best tracker ever reported, facilitates advantages feature learning structured output prediction.",4 "pros cons gan evaluation measures. generative models, particular generative adversarial networks (gans), received lot attention recently. number gan variants proposed utilized many applications. despite large strides terms theoretical progress, evaluating comparing gans remains daunting task. several measures introduced, yet, consensus measure best captures strengths limitations models used fair model comparison. areas computer vision machine learning, critical settle one good measures steer progress field.
paper, review critically discuss 19 quantitative 4 qualitative measures evaluating generative models particular emphasis gan-derived models.",4 "discrete network dynamics. part 1: operator theory. operator algebra implementation markov chain monte carlo algorithms simulating markov random fields proposed. allows dynamics networks whose nodes discrete state spaces specified action update operator composed creation annihilation operators. formulation discrete network dynamics properties similar quantum field theory bosons, allows reuse many conceptual theoretical structures qft. equilibrium behaviour one generalised mrfs adaptive cluster expansion network (acenet) shown equivalent, provides way unifying two theories.",4 "decision uncertainty diagnosis. paper describes incorporation uncertainty diagnostic reasoning based set covering model reggia et al. extended artificial intelligence dichotomy deep compiled (shallow, surface) knowledge based diagnosis may viewed generic form compiled end spectrum. major undercurrent advocating need strong underlying model integrated set support tools carrying model order deal uncertainty.",4 "neural sequence model training via $α$-divergence minimization. propose new neural sequence model training method objective function defined $\alpha$-divergence. demonstrate objective function generalizes maximum-likelihood (ml)-based reinforcement learning (rl)-based objective functions special cases (i.e., ml corresponds $\alpha \to 0$ rl $\alpha \to 1$). also show gradient objective function considered mixture ml- rl-based objective gradients. experimental results machine translation task show minimizing objective function $\alpha > 0$ outperforms $\alpha \to 0$, corresponds ml-based methods.",19 "topic stability noisy sources. topic modelling techniques lda recently applied speech transcripts ocr output. corpora may contain noisy erroneous texts may undermine topic stability.
therefore, important know well topic modelling algorithm perform applied noisy data. paper show different types textual noise diverse effects stability different topic models. observations, propose guidelines text corpus generation, focus automatic speech transcription. also suggest topic model selection methods noisy corpora.",4 "target tracking real time surveillance cameras videos. security concerns kept increasing, important everyone keep property safe thefts destruction. need surveillance techniques also increasing. system developed detect motion video. system developed real time applications using techniques background subtraction frame differencing. system, motion detected webcam real time video. background subtraction frames differencing method used detect moving target. background subtraction method, current frame subtracted referenced frame threshold applied. difference greater threshold considered pixel moving object, otherwise considered background pixel. similarly, two frames difference method takes difference two continuous frames. resultant difference frame thresholded amount difference pixels calculated.",4 "unsupervised spike sorting based discriminative subspace learning. spike sorting fundamental preprocessing step many neuroscience studies rely analysis spike trains. paper, present two unsupervised spike sorting algorithms based discriminative subspace learning. first algorithm simultaneously learns discriminative feature subspace performs clustering. uses histogram features discriminative projection detect number neurons. second algorithm performs hierarchical divisive clustering learns discriminative 1-dimensional subspace clustering level hierarchy achieving almost unimodal distribution subspace. algorithms tested synthetic in-vivo data, compared two widely used spike sorting methods. comparative results demonstrate spike sorting methods achieve substantially higher accuracy lower dimensional feature space, highly robust noise. 
moreover, provide significantly better cluster separability learned subspace subspace obtained principal component analysis wavelet transform.",4 "clustering-based quantisation pde-based image compression. finding optimal data inpainting key problem context partial differential equation based image compression. data yields accurate reconstruction real-valued. thus, quantisation models mandatory allow efficient encoding. also understood challenging data clustering problems. although clustering approaches well suited kind compression codecs, works actually consider them. pixel global impact reconstruction optimal data locations strongly correlated corresponding colour values. facts make hard predict feature works best. paper discuss quantisation strategies based popular methods k-means. lead central question kind feature vectors best suited image compression. end consider choices pixel values, histogram colour map. findings show number colours reduced significantly without impacting reconstruction quality. surprisingly, benefits directly translate good image compression performance. gains compression ratio lost due increased storage costs. suggests integral evaluate clustering both, reconstruction error final file size.",4 "stochastic variance reduction methods policy evaluation. policy evaluation crucial step many reinforcement-learning procedures, estimates value function predicts states' long-term value given policy. paper, focus policy evaluation linear function approximation fixed dataset. first transform empirical policy evaluation problem (quadratic) convex-concave saddle point problem, present primal-dual batch gradient method, well two stochastic variance reduction methods solving problem. algorithms scale linearly sample size feature dimension. moreover, achieve linear convergence even saddle-point problem strong concavity dual variables strong convexity primal variables. 
numerical experiments benchmark problems demonstrate effectiveness methods.",4 "mixture counting cnns: adaptive integration cnns specialized specific appearance crowd counting. paper proposes crowd counting method. crowd counting difficult large appearance changes target caused density scale changes. conventional crowd counting methods generally utilize one predictor (e.g., regression multi-class classifier). however, one predictor count targets large appearance changes well. paper, propose predict number targets using multiple cnns specialized specific appearance, cnns adaptively selected according appearance test image. integrating selected cnns, proposed method robustness large appearance changes. experiments, confirm proposed method count crowd lower counting error cnn integration cnns fixed weights. moreover, confirm predictor automatically specialized specific appearance.",4 "improved ant colony system sequential ordering problem. rare performance one metaheuristic algorithm improved incorporating ideas taken another. article present simulated annealing (sa) used improve efficiency ant colony system (acs) enhanced acs solving sequential ordering problem (sop). moreover, show ideas applied improve convergence dedicated local search, i.e. sop-3-exchange algorithm. statistical analysis proposed algorithms terms finding suitable parameter values quality generated solutions presented based series computational experiments conducted sop instances well-known tsplib soplib2006 repositories. proposed acs-sa eacs-sa algorithms often generate solutions better quality acs eacs, respectively. moreover, eacs-sa algorithm combined proposed sop-3-exchange-sa local search able find 10 new best solutions sop instances soplib2006 repository, thus improving state-of-the-art results known literature. overall, best known improved solutions found 41 48 cases.",4 "metric-free natural gradient joint-training boltzmann machines.
paper introduces metric-free natural gradient (mfng) algorithm training boltzmann machines. similar spirit hessian-free method martens [8], algorithm belongs family truncated newton methods exploits efficient matrix-vector product avoid explicitly storing natural gradient metric $l$. metric shown expected second derivative log-partition function (under model distribution), equivalently, variance vector partial derivatives energy function. evaluate method task joint-training 3-layer deep boltzmann machine show mfng indeed faster per-epoch convergence compared stochastic maximum likelihood centering, though wall-clock performance currently competitive.",4 "tiny descriptors image retrieval unsupervised triplet hashing. typical image retrieval pipeline starts comparison global descriptors large database find short list candidate matches. good image descriptor key retrieval pipeline reconcile two contradictory requirements: providing recall rates high possible compact possible fast matching. following recent successes deep convolutional neural networks (dcnn) large scale image classification, descriptors extracted dcnns increasingly used place traditional hand crafted descriptors fisher vectors (fv) better retrieval performances. nevertheless, dimensionality typical dcnn descriptor --extracted either visual feature pyramid fully-connected layers-- remains quite high several thousands scalar values. paper, propose unsupervised triplet hashing (uth), fully unsupervised method compute extremely compact binary hashes --in 32-256 bits range-- high-dimensional global descriptors. uth consists two successive deep learning steps. first, stacked restricted boltzmann machines (srbm), type unsupervised deep neural nets, used learn binary embedding functions able bring descriptor size desired bitrate. srbms typically able ensure high compression rate expense losing desirable metric properties original dcnn descriptor space.
then, triplet networks, rank learning scheme based weight sharing nets used fine-tune binary embedding functions retain much possible useful metric properties original space. thorough empirical evaluation conducted multiple publicly available dataset using dcnn descriptors shows method able significantly outperform state-of-the-art unsupervised schemes target bit range.",4 "large-scale optimization algorithms sparse conditional gaussian graphical models. paper addresses problem scalable optimization l1-regularized conditional gaussian graphical models. conditional gaussian graphical models generalize well-known gaussian graphical models conditional distributions model output network influenced conditioning input variables. highly scalable optimization methods exist sparse gaussian graphical model estimation, state-of-the-art methods conditional gaussian graphical models efficient enough importantly, fail due memory constraints large problems. paper, propose new optimization procedure based newton method efficiently iterates two sub-problems, leading drastic improvement computation time compared previous methods. extend method scale large problems memory constraints, using block coordinate descent limit memory usage achieving fast convergence. using synthetic genomic data, show methods solve one million dimensional problems high accuracy little day single machine.",19 "adversarial perturbations deep neural networks malware classification. deep neural networks, like many machine learning models, recently shown lack robustness adversarially crafted inputs. inputs derived regular inputs minor yet carefully selected perturbations deceive machine learning models desired misclassifications. existing work emerging field largely specific domain image classification, since high-entropy images conveniently manipulated without changing images' overall visual appearance. 
yet, remains unclear attacks translate security-sensitive applications malware detection - may pose significant challenges sample generation arguably grave consequences failure. paper, show construct highly-effective adversarial sample crafting attacks neural networks used malware classifiers. application domain malware classification introduces additional constraints adversarial sample crafting problem compared computer vision domain: (i) continuous, differentiable input domains replaced discrete, often binary inputs; (ii) loose condition leaving visual appearance unchanged replaced requiring equivalent functional behavior. demonstrate feasibility attacks many different instances malware classifiers trained using drebin android malware data set. furthermore evaluate extent potential defensive mechanisms adversarial crafting leveraged setting malware classification. feature reduction prove positive impact, distillation re-training adversarially crafted samples show promising results.",4 "multiscale edge detection parametric shape modeling boundary delineation optoacoustic images. article, present novel scheme segmenting image boundary (with background) optoacoustic small animal vivo imaging systems. method utilizes multiscale edge detection algorithm generate binary edge map. scale dependent morphological operation employed clean spurious edges. thereafter, ellipse fitted edge map constrained parametric transformations iterative goodness fit calculations. method delimits tissue edges curve fitting model, shown high levels accuracy. thus, method enables segmentation optoacoustic images minimal human intervention, eliminating need scale selection multiscale processing seed point determination contour mapping.",15 "control crack propagate along specified path feasibly?. controllable crack propagation (ccp) strategy suggested. well known crack always leads failure crossing critical domain engineering structure.
therefore, ccp method proposed control crack propagate along specified path, away critical domain. complete strategy, two optimization methods engaged. firstly, back propagation neural network (bpnn) assisted particle swarm optimization (pso) suggested. method, improve efficiency ccp, bpnn used build metamodel instead forward evaluation. secondly, popular pso used. considering optimization iteration time consuming process, efficient reanalysis based extended finite element methods (x-fem) used substitute complete x-fem solver calculate crack propagation path. moreover, adaptive subdomain partition strategy suggested improve fitting accuracy real crack specified paths. several typical numerical examples demonstrate optimization methods carry ccp. selection determined tradeoff efficiency accuracy.",4 "swarm intelligence based algorithms: critical analysis. many optimization algorithms developed drawing inspiration swarm intelligence (si). si-based algorithms advantages traditional algorithms. paper, carry critical analysis si-based algorithms analyzing ways mimic evolutionary operators. also analyze ways achieving exploration exploitation algorithms using mutation, crossover selection. addition, also look algorithms using dynamic systems, self-organization markov chain framework. finally, provide discussions topics research.",12 "recognizing static signs brazilian sign language: comparing large-margin decision directed acyclic graphs, voting support vector machines artificial neural networks. paper, explore detail experiments high-dimensionality, multi-class image classification problem often found automatic recognition sign languages. here, efforts directed towards comparing characteristics, advantages drawbacks creating training support vector machines disposed directed acyclic graph artificial neural networks classify signs brazilian sign language (libras). 
explore different heuristics, hyperparameters multi-class decision schemes affect performance, efficiency ease use classifier. provide hyperparameter surface maps capturing accuracy efficiency, comparisons ddags 1-vs-1 svms, effects heuristics training anns resilient backpropagation. report statistically significant results using cohen's kappa statistic contingency tables.",4 "introduction ross: new representational scheme. ross (""representation, ontology, structure, star"") introduced new method knowledge representation emphasizes representational constructs physical structure. ross representational scheme includes language called ""star"" specification ontology classes. ross method also includes formal scheme called ""instance model"". instance models used area natural language meaning representation represent situations. paper provides rationale philosophical background ross method.",4 "context-aware captions context-agnostic supervision. introduce inference technique produce discriminative context-aware image captions (captions describe differences images visual concepts) using generic context-agnostic training data (captions describe concept image isolation). example, given images captions ""siamese cat"" ""tiger cat"", generate language describes ""siamese cat"" way distinguishes ""tiger cat"". key novelty show joint inference language model context-agnostic listener distinguishes closely-related concepts. first apply technique justification task, namely describe image contains particular fine-grained category opposed another closely-related category cub-200-2011 dataset. study discriminative image captioning generate language uniquely refers one two semantically-similar images coco dataset. evaluations discriminative ground truth justification human studies discriminative image captioning reveal approach outperforms baseline generative speaker-listener approaches discrimination.",4 "cubic range error model stereo vision illuminators. 
use low-cost depth sensors, stereo camera setup illuminators, particular interest numerous applications ranging robotics transportation mixed augmented reality. ability quantify noise crucial applications, e.g., sensor used map generation develop sensor scheduling policy multi-sensor setup. range error models provide uncertainty estimates help weigh data correctly instances range measurements taken different vantage points different sensors. weighing important fuse range data map meaningful way, i.e., high confidence data relied heavily. model derived work. show range error stereo systems integrated illuminators cubic validate proposed model experimentally off-the-shelf structured light stereo system. experiments confirm validity model simplify application type sensor robotics. proposed error model relevant stereo system low ambient light main light source located camera system. among others, case structured light stereo systems night stereo systems headlights. work, propose range error cubic range stereo systems integrated illuminators. experimental validation off-the-shelf structured light stereo system shows exponent 2.4 2.6. deviation attributed model considering shot noise.",4 "dynamic island model based spectral clustering genetic algorithm. maintain relative high diversity important avoid premature convergence population-based optimization methods. island model widely considered major approach achieve flexibility high efficiency. model maintains group sub-populations different islands allows sub-populations interact via predefined migration policies. however, current island model drawbacks. one certain number generations, different islands may retain quite similar, converged sub-populations thereby losing diversity decreasing efficiency. another drawback determining number islands maintain also challenging. meanwhile initializing many sub-populations increases randomness island model. 
address issues, proposed dynamic island model (dim-sp) force island maintain different sub-populations, control number islands dynamically starts one sub-population. proposed island model outperforms three state-of-the-art island models three baseline optimization problems including job shop scheduler problem, travelling salesman problem quadratic multiple knapsack problem.",4 "imaging time-series improve classification imputation. inspired recent successes deep learning computer vision, propose novel framework encoding time series different types images, namely, gramian angular summation/difference fields (gasf/gadf) markov transition fields (mtf). enables use techniques computer vision time series classification imputation. used tiled convolutional neural networks (tiled cnns) 20 standard datasets learn high-level features individual compound gasf-gadf-mtf images. approaches achieve highly competitive results compared nine current best time series classification approaches. inspired bijection property gasf 0/1 rescaled data, train denoised auto-encoders (da) gasf images four standard one synthesized compound dataset. imputation mse test data reduced 12.18%-48.02% compared using raw data. analysis features weights learned via tiled cnns das explains approaches work.",4 "partial functional correspondence. paper, propose method computing partial functional correspondence non-rigid shapes. use perturbation analysis show removal shape parts changes laplace-beltrami eigenfunctions, exploit prior spectral representation correspondence. corresponding parts optimization variables problem used weight functional correspondence; looking largest regular (in mumford-shah sense) parts minimize correspondence distortion. show approach cope challenging correspondence settings.",4 "learning diversify via weighted kernels classifier ensemble. classifier ensemble generally combine diverse component classifiers.
however, difficult give definitive connection diversity measure ensemble accuracy. given list available component classifiers, adaptively diversely ensemble classifiers becomes big challenge literature. paper, argue diversity, direct diversity samples adaptive diversity data, highly correlated ensemble accuracy, propose novel technology classifier ensemble, learning diversify, learns adaptively combine classifiers considering accuracy diversity. specifically, approach, learning diversify via weighted kernels (l2dwk), performs classifier combination optimizing direct simple criterion: maximizing ensemble accuracy adaptive diversity simultaneously minimizing convex loss function. given measure formulation, diversity calculated weighted kernels (i.e., diversity measured component classifiers' outputs kernelled weighted), kernel weights automatically learned. minimize loss function estimating kernel weights conjunction classifier weights, propose self-training algorithm conducting convex optimization procedure iteratively. extensive experiments variety 32 uci classification benchmark datasets show proposed approach consistently outperforms state-of-the-art ensembles bagging, adaboost, random forests, gasen, regularized selective ensemble, ensemble pruning via semi-definite programming.",4 "learning remember translation history continuous cache. existing neural machine translation (nmt) models generally translate sentences isolation, missing opportunity take advantage document-level information. work, propose augment nmt models light-weight cache-like memory network, stores recent hidden representations translation history. probability distribution generated words updated online depending translation history retrieved memory, endowing nmt models capability dynamically adapt time. 
experiments multiple domains different topics styles show effectiveness proposed approach negligible impact computational cost.",4 "learning compact recurrent neural networks block-term tensor decomposition. recurrent neural networks (rnns) powerful sequence modeling tools. however, dealing high dimensional inputs, training rnns becomes computationally expensive due large number model parameters. hinders rnns solving many important computer vision tasks, action recognition videos image captioning. overcome problem, propose compact flexible structure, namely block-term tensor decomposition, greatly reduces parameters rnns improves training efficiency. compared alternative low-rank approximations, tensor-train rnn (tt-rnn), method, block-term rnn (bt-rnn), concise (when using rank), also able attain better approximation original rnns much fewer parameters. three challenging tasks, including action recognition videos, image captioning image generation, bt-rnn outperforms tt-rnn standard rnn terms prediction accuracy convergence rate. specifically, bt-lstm utilizes 17,388 times fewer parameters standard lstm achieve accuracy improvement 15.6% action recognition task ucf11 dataset.",4 "neural algorithm artistic style. fine art, especially painting, humans mastered skill create unique visual experiences composing complex interplay content style image. thus far algorithmic basis process unknown exists artificial system similar capabilities. however, key areas visual perception object face recognition near-human performance recently demonstrated class biologically inspired vision models called deep neural networks. introduce artificial system based deep neural network creates artistic images high perceptual quality. system uses neural representations separate recombine content style arbitrary images, providing neural algorithm creation artistic images.
moreover, light striking similarities performance-optimised artificial neural networks biological vision, work offers path forward algorithmic understanding humans create perceive artistic imagery.",4 "improving term extraction terminological resources. studies different term extractors corpus biomedical domain revealed decreasing performances applied highly technical texts. difficulty impossibility customising new domains additional limitation. paper, propose use external terminologies influence generic linguistic data order augment quality extraction. tool implemented exploits testified terms different steps process: chunking, parsing extraction term candidates. experiments reported show that, using method, term candidates acquired higher level reliability. describe extraction process involving endogenous disambiguation implemented term extractor yatea.",4 "persona-based neural conversation model. present persona-based models handling issue speaker consistency neural response generation. speaker model encodes personas distributed embeddings capture individual characteristics background information speaking style. dyadic speaker-addressee model captures properties interactions two interlocutors. models yield qualitative performance improvements perplexity bleu scores baseline sequence-to-sequence models, similar gains speaker consistency measured human judges.",4 "automatic summarization online debates. debate summarization one novel challenging research areas automatic text summarization largely unexplored. paper, develop debate summarization pipeline summarize key topics discussed argued two opposing sides online debates. view generation debate summaries achieved clustering, cluster labeling, visualization. work, investigate two different clustering approaches generation summaries. first approach, generate summaries applying purely term-based clustering cluster labeling. second approach makes use x-means clustering mutual information labeling clusters. 
approaches driven ontologies. visualize results using bar charts. think results smooth entry users aiming receive first impression discussed within debate topic containing vast number argumentations.",4 fuzzy vault fingerprints vulnerable brute force attack. fuzzy vault approach one best studied well accepted ideas binding cryptographic security biometric authentication. vault implemented connection fingerprint data uludag jain. show instance vault vulnerable brute force attack. interceptor vault data recover secret template data using generally affordable computational resources. possible alternatives discussed suggested cryptographic security may preferable one-way function approach biometric security.,4 "hacking smart machines smarter ones: extract meaningful data machine learning classifiers. machine learning (ml) algorithms used train computers perform variety complex tasks improve experience. computers learn recognize patterns, make unintended decisions, react dynamic environment. certain trained machines may effective others based suitable ml algorithms trained superior training sets. although ml algorithms known publicly released, training sets may reasonably ascertainable and, indeed, may guarded trade secrets. much research performed privacy elements training sets, paper focus attention ml classifiers statistical information unconsciously maliciously revealed them. show possible infer unexpected useful information ml classifiers. particular, build novel meta-classifier train hack classifiers, obtaining meaningful information training sets. kind information leakage exploited, example, vendor build effective classifiers simply acquire trade secrets competitor's apparatus, potentially violating intellectual property rights.",4 "iterative object part transfer fine-grained recognition. aim fine-grained recognition identify sub-ordinate categories images like different species birds.
existing works confirmed that, order capture subtle differences across categories, automatic localization objects parts critical. approaches object part localization relied bottom-up pipeline, thousands region proposals generated filtered pre-trained object/part models. computationally expensive scalable number objects/parts becomes large. paper, propose nonparametric data-driven method object part localization. given unlabeled test image, approach transfers annotations similar images retrieved training set. particular, propose iterative transfer strategy gradually refine predicted bounding boxes. based located objects parts, deep convolutional features extracted recognition. evaluate approach widely-used cub200-2011 dataset new large dataset called birdsnap. datasets, achieve better results many state-of-the-art approaches, including using oracle (manually annotated) bounding boxes test images.",4 "$\mathcal{o}(n\log n)$ projection operator weighted $\ell_1$-norm regularization sum constraint. provide simple efficient algorithm projection operator weighted $\ell_1$-norm regularization subject sum constraint, together elementary proof. implementation proposed algorithm downloaded author's homepage.",4 "bounded recursive self-improvement. designed machine becomes increasingly better behaving underspecified circumstances, goal-directed way, job, modeling environment experience accumulates. based principles autocatalysis, endogeny, reflectivity, work provides architectural blueprint constructing systems high levels operational autonomy underspecified circumstances, starting small seed. value-driven dynamic priority scheduling controlling parallel execution vast number reasoning threads, system achieves recursive self-improvement leaves lab, within boundaries imposed designers. prototype system implemented demonstrated learn complex real-world task, real-time multimodal dialogue humans, on-line observation. 
work presents solutions several challenges must solved achieving artificial general intelligence.",4 noisy expectation-maximization: applications generalizations. present noise-injected version expectation-maximization (em) algorithm: noisy expectation maximization (nem) algorithm. nem algorithm uses noise speed convergence em algorithm. nem theorem shows injected noise speeds average convergence em algorithm local maximum likelihood surface positivity condition holds. generalized form noisy expectation-maximization (nem) algorithm allow arbitrary modes noise injection including adding multiplying noise data. demonstrate noise benefits em algorithms gaussian mixture model (gmm) additive multiplicative nem noise injection. separate theorem (not presented here) shows noise benefit independent identically distributed additive noise decreases sample size mixture models. theorem implies noise benefit pronounced data sparse. injecting blind noise slowed convergence.,19 "theoretical analysis ndcg type ranking measures. central problem ranking design ranking measure evaluation ranking functions. paper study, theoretical perspective, widely used normalized discounted cumulative gain (ndcg)-type ranking measures. although extensive empirical studies ndcg, little known theoretical properties. first show that, whatever ranking function is, standard ndcg adopts logarithmic discount, converges 1 number items rank goes infinity. first sight, result surprising. seems imply ndcg cannot differentiate good bad ranking functions, contradicting empirical success ndcg many applications. order deeper understanding ranking measures general, propose notion referred consistent distinguishability. notion captures intuition ranking measure property: every pair substantially different ranking functions, ranking measure decide one better consistent manner almost datasets. show ndcg logarithmic discount consistent distinguishability although converges limit ranking functions. 
next characterize set feasible discount functions ndcg according concept consistent distinguishability. specifically show whether ndcg consistent distinguishability depends fast discount decays, 1/r critical point. turn cut-off version ndcg, i.e., ndcg@k. analyze distinguishability ndcg@k various choices k discount functions. experimental results real web search datasets agree well theory.",4 "hierarchical latent word clustering. paper presents new bayesian non-parametric model extending usage hierarchical dirichlet allocation extract tree structured word clusters text data. inference algorithm model collects words cluster share similar distribution documents. experiments, observed meaningful hierarchical structures nips corpus radiology reports collected public repositories.",4 "bio-inspired data mining: treating malware signatures biosequences. application machine learning bioinformatics problems well established. less well understood application bioinformatics techniques machine learning and, particular, representation non-biological data biosequences. aim paper explore effects giving amino acid representation problematic machine learning data evaluate benefits supplementing traditional machine learning bioinformatics tools techniques. signatures 60 computer viruses 60 computer worms converted amino acid representations first multiply aligned separately identify conserved regions across different families within class (virus worm). followed second alignment 120 aligned signatures together non-conserved regions identified prior input number machine learning techniques. differences length virus worm signatures first alignment resolved second alignment. first set experiments indicates representing computer malware signatures amino acid sequences followed alignment leads greater classification prediction accuracy. 
second set experiments indicates checking results data mining artificial virus worm data known proteins lead generalizations made domain naturally occurring proteins malware signatures. however, work needed determine advantages disadvantages different representations sequence alignment methods handling problematic machine learning data.",4 "fast rates bandit optimization upper-confidence frank-wolfe. consider problem bandit optimization, inspired stochastic optimization online learning problems bandit feedback. problem, objective minimize global loss function actions, necessarily cumulative loss. framework allows us study general class problems, applications statistics, machine learning, fields. solve problem, analyze upper-confidence frank-wolfe algorithm, inspired techniques bandits convex optimization. give theoretical guarantees performance algorithm various classes functions, discuss optimality results.",4 "mutual kernel matrix completion. huge influx various data nowadays, extracting knowledge become interesting tedious task among data scientists, particularly data come heterogeneous form missing information. many data completion techniques introduced, especially advent kernel methods. however, among many data completion techniques available literature, studies mutually completing several incomplete kernel matrices given much attention yet. paper, present new method, called mutual kernel matrix completion (mkmc) algorithm, tackles problem mutually inferring missing entries multiple kernel matrices combining notions data fusion kernel matrix completion, applied biological data sets used classification task. first introduced objective function minimized exploiting em algorithm, turn results estimate missing entries kernel matrices involved. completed kernel matrices combined produce model matrix used improve obtained estimates. interesting result study e-step m-step given closed form, makes algorithm efficient terms time memory. 
completion, (completed) kernel matrices used train svm classifier test well relationships among entries preserved. empirical results show proposed algorithm bested traditional completion techniques preserving relationships among data points, accurately recovering missing kernel matrix entries. far, mkmc offers promising solution problem mutual estimation number relevant incomplete kernel matrices.",4 "fractal structures adversarial prediction. fractals self-similar recursive structures used modeling several real world processes. work study ""fractal-like"" processes arise prediction game adversary generating sequence bits algorithm trying predict them. see certain formalization predictive payoff algorithm optimal adversary produce fractal-like sequence minimize algorithm's ability predict. indeed suggested financial markets exhibit fractal-like behavior. prove fractal-like distribution arises naturally optimization adversary's perspective. addition, give optimal trade-offs predictability expected deviation (i.e. sum bits) formalization predictive payoff. result motivated observation several time series data exhibit higher deviations expected completely random walk.",4 "oblivious branching programs bounded repetition cannot efficiently compute cnfs bounded treewidth. paper study complexity extension ordered binary decision diagrams (obdds) called $c$-obdds cnfs bounded (primal graph) treewidth. particular, show $k$ class cnfs treewidth $k \geq 3$ equivalent $c$-obdds size $\omega(n^{k/(8c-4)})$. moreover, lower bound holds $c$-obdd non-deterministic semantic. second result uses lower bound separate model sentential decision diagrams (sdds). order obtain lower bound, use structural graph parameter called matching width. third result shows matching width pathwidth linearly related.",4 "mining process model descriptions daily life event abstraction. process mining techniques focus extracting insight processes event logs. 
process mining has the potential to provide valuable insights in (un)healthy habits and to contribute to ambient assisted living solutions when applied on data from smart home environments. however, events recorded in smart home environments are at the level of sensor triggers, at which process discovery algorithms produce overgeneralizing process models that allow for too much behavior and that are difficult to interpret for human experts. we show that abstracting the events to a higher-level interpretation enables the discovery of more precise and more comprehensible models. we present a framework for the extraction of features that can be used for abstraction with supervised learning methods, based on the xes ieee standard for event logs. this framework can automatically abstract sensor-level events to their interpretation at the human activity level, after training on training data for which both the sensor and human activity events are known. we demonstrate our abstraction framework on three real-life smart home event logs and show that the process models discovered after abstraction are indeed more precise.",4 "learning sparse deep feedforward networks via tree skeleton expansion. despite the popularity of deep learning, structure learning for deep models remains a relatively under-explored area. in contrast, structure learning has been studied extensively for probabilistic graphical models (pgms). in particular, an efficient algorithm has been developed for learning a class of tree-structured pgms called hierarchical latent tree models (hltms), where there is a layer of observed variables at the bottom and multiple layers of latent variables on top. in this paper, we propose a simple method for learning the structures of feedforward neural networks (fnns) based on hltms. the idea is to expand the connections in the tree skeletons from hltms and to use the resulting structures for fnns. an important characteristic of the fnn structures learned this way is that they are sparse. we present extensive empirical results to show that, compared with standard fnns tuned manually, the sparse fnns learned by our method achieve better or comparable classification performance with much fewer parameters. they are also more interpretable.",4 "a pos tagger for code mixed indian social media text - icon-2016 nlp tools contest entry from surukam.
building part-of-speech (pos) taggers for code-mixed indian languages is a particularly challenging problem in computational linguistics due to the dearth of accurately annotated training corpora. icon, as part of its nlp tools contest, has organized this challenge as a shared task for the second consecutive year to improve the state-of-the-art. this paper describes the pos tagger built at surukam to predict the coarse-grained and fine-grained pos tags for three language pairs - bengali-english, telugu-english and hindi-english, with the text spanning three popular social media platforms - facebook, whatsapp and twitter. we employed conditional random fields as the sequence tagging algorithm and used a library called sklearn-crfsuite - a thin wrapper around crfsuite - for training our model. among the features we used are character n-grams, language information and patterns for emoji, number, punctuation and web-address. our submissions in the constrained environment, i.e., without making use of monolingual pos taggers or the like, obtained an overall average f1-score of 76.45%, which is comparable to the 2015 winning score of 76.79%.",4 "fast rhetorical structure theory discourse parsing. in recent years, there has been a variety of research on discourse parsing, particularly rst discourse parsing. most recent work on rst parsing has focused on implementing new types of features or learning algorithms in order to improve accuracy, with relatively little focus on efficiency, robustness, or practical use. also, few implementations are widely available. here, we describe an rst segmentation and parsing system that adapts models and feature sets from various previous work, as described below. its accuracy is near state-of-the-art, and it was developed to be fast, robust, and practical. for example, it can process short documents such as news articles or essays in less than a second.",4 "analysis of spectrum occupancy using machine learning algorithms. in this paper, we analyze spectrum occupancy using different machine learning techniques. both supervised techniques (naive bayesian classifier (nbc), decision trees (dt), support vector machine (svm), linear regression (lr)) and an unsupervised algorithm (hidden markov model (hmm)) are studied to find the best technique with the highest classification accuracy (ca).
a detailed comparison of the supervised and unsupervised algorithms in terms of computational time and classification accuracy is performed. the classified occupancy status is then utilized to evaluate the probability of secondary user outage for future time slots, which can be used by system designers to define spectrum allocation and spectrum sharing policies. numerical results show that svm is the best algorithm among all the supervised and unsupervised classifiers. based on this, we propose a new svm algorithm combined with a fire fly algorithm (ffa), which is shown to outperform the other algorithms.",4 "identifying purpose behind electoral tweets. tweets pertaining to a single event, such as a national election, can number in the hundreds of millions. automatically analyzing them is beneficial for many downstream natural language applications such as question answering and summarization. in this paper, we propose a new task: identifying the purpose behind electoral tweets--why do people post election-oriented tweets? we show that identifying purpose is correlated with the related phenomena of sentiment and emotion detection, yet is significantly different. detecting purpose has a number of applications, including detecting the mood of the electorate, estimating the popularity of policies, identifying key issues of contention, and predicting the course of events. we create a large dataset of electoral tweets and annotate a few thousand tweets for purpose. we develop a system that automatically classifies electoral tweets as per their purpose, obtaining an accuracy of 43.56% on an 11-class task and an accuracy of 73.91% on a 3-class task (both accuracies well above the most-frequent-class baseline). finally, we show that resources developed for emotion detection are also helpful for detecting purpose.",4 "ordinal rating of network performance and inference by matrix completion. this paper addresses the large-scale acquisition of end-to-end network performance. we have made two distinct contributions: ordinal rating of network performance and inference by matrix completion. the former reduces measurement costs and unifies various metrics, which eases their processing in applications. the latter enables scalable and accurate inference with no requirement of structural information of the network nor of geometric constraints. by combining both, the acquisition problem bears strong similarities to recommender systems.
this paper investigates the applicability of various matrix factorization models used in recommender systems. we found that the simple regularized matrix factorization is not only practical but also produces accurate results that are beneficial to peer selection.",4 "evolving intraday foreign exchange trading strategies utilizing multiple instruments price series. we propose a genetic programming architecture for the generation of foreign exchange trading strategies. the system's principal features are the evolution of free-form strategies which do not rely on any prior models and the utilization of price series from multiple instruments as input data. the latter feature constitutes an innovation with respect to previous works documented in the literature. in this article we utilize open, high, low and close bar data at 5 minutes frequency for the aud.usd, eur.usd, gbp.usd and usd.jpy currency pairs. we test the implementation by analyzing the in-sample and out-of-sample performance of strategies for trading usd.jpy obtained across multiple algorithm runs. we also evaluate the differences between strategies selected according to two different criteria: one relies on the fitness obtained on the training set only, the second one makes use of an additional validation dataset. strategy activity and trade accuracy are remarkably stable between in-sample and out-of-sample results. on the profitability aspect, the two criteria result in strategies that are successful on out-of-sample data but exhibit different characteristics. the overall best performing out-of-sample strategy achieves a yearly return of 19%.",4 "fear of bit flips: optimized coding strategies for binary classification. after being trained, classifiers must often operate on data that has been corrupted by noise. in this paper, we consider the impact of such noise on the features of binary classifiers. inspired by tools for classifier robustness, we introduce the same classification probability (scp) to measure the resulting distortion on the classifier outputs. we introduce a low-complexity estimate of the scp based on quantization and polynomial multiplication. we also study channel coding techniques based on replication and error-correcting codes.
in contrast to the traditional channel coding approach, where error-correction is meant to preserve the data and is agnostic to the application, our schemes specifically aim to maximize the scp (equivalently, to minimize the distortion of the classifier output) for a given redundancy overhead.",19 "sleeping beauty reconsidered: conditioning and reflection in asynchronous systems. a careful analysis of conditioning in the sleeping beauty problem is done, using the formal model for reasoning about knowledge and probability developed by halpern and tuttle. while the sleeping beauty problem has been viewed as revealing problems with conditioning in the presence of imperfect recall, the analysis done here reveals that the problems are not so much due to imperfect recall as to asynchrony. the implications of this analysis for van fraassen's reflection principle and savage's sure-thing principle are considered.",4 "a discussion among different methods of updating the model filter in object tracking. discriminative correlation filters (dcf) have recently shown excellent performance in the visual object tracking area. in this paper, we summarize the methods of updating the model filter in discriminative correlation filter (dcf) based tracking algorithms and analyze the similarities and differences among these methods. we deduce the relationship among updating the coefficient in high dimension (kernel trick), updating the filter in the frequency domain and updating the filter in the spatial domain, and analyze the differences among these ways. we also analyze the difference between updating the filter directly and updating the filter's numerator (object response power) together with the filter's denominator (filter's power). experiments comparing different updating methods and visualizing the template filters are used to prove the derivation.",4 "finding influential training samples for gradient boosted decision trees. we address the problem of finding influential training samples for a particular case of tree ensemble-based models, e.g., random forest (rf) or gradient boosted decision trees (gbdt). a natural way of formalizing this problem is studying how the model's predictions change upon leave-one-out retraining, leaving out each individual training sample. recent work has shown that, for parametric models, this analysis can be conducted in a computationally efficient way.
we propose several ways of extending this framework to non-parametric gbdt ensembles under the assumption that tree structures remain fixed. furthermore, we introduce a general scheme of obtaining further approximations to our method that balance the trade-off between performance and computational complexity. we evaluate our approaches on various experimental setups and use-case scenarios and demonstrate both the quality of our approach to finding influential training samples in comparison to the baselines and its computational efficiency.",4 "political homophily in independence movements: analysing and classifying social media users by national identity. social media and data mining are increasingly being used to analyse political and societal issues. here we undertake the classification of social media users as supporting or opposing ongoing independence movements in their territories. independence movements occur in territories whose citizens have conflicting national identities; users with opposing national identities will then support or oppose the sense of being part of an independent nation that differs from the officially recognised country. we describe a methodology that relies on users' self-reported location to build large-scale datasets for three territories -- catalonia, the basque country and scotland. an analysis of these datasets shows that homophily plays an important role in determining who people connect with, as users predominantly choose to follow and interact with others from the same national identity. we show that a classifier relying on users' follow networks can achieve accurate, language-independent classification performances ranging from 85% to 97% for the three territories.",4 "improved speech reconstruction from silent video. speechreading is the task of inferring phonetic information from visually observed articulatory facial movements, and is a notoriously difficult task for humans to perform. in this paper we present an end-to-end model based on a convolutional neural network (cnn) for generating an intelligible and natural-sounding acoustic speech signal from silent video frames of a speaking person. we train our model on speakers from the grid and tcd-timit datasets, and evaluate the quality and intelligibility of reconstructed speech using common objective measurements. we show that speech predictions from the proposed model attain scores which indicate significantly improved quality over existing models.
in addition, we show promising results towards reconstructing speech from an unconstrained dictionary.",4 "translating answer-set programs into bit-vector logic. answer set programming (asp) is a paradigm for declarative problem solving where problems are first formalized as rule sets, i.e., answer-set programs, in a uniform way and then solved by computing answer sets for the programs. the satisfiability modulo theories (smt) framework follows a similar modelling philosophy but the syntax is based on extensions of propositional logic rather than rules. quite recently, a translation from answer-set programs into difference logic was provided---enabling the use of particular smt solvers for the computation of answer sets. in this paper, the translation is revised for another smt fragment, namely the one based on fixed-width bit-vector theories. thus, even further smt solvers can be harnessed for the task of computing answer sets. the results of a preliminary experimental comparison are also reported. they suggest a level of performance which is similar to that achieved via difference logic.",4 "a novel energy aware node clustering algorithm for wireless sensor networks using a modified artificial fish swarm algorithm. clustering problems are considered amongst the most prominent challenges in statistics and computational science. clustering of nodes in wireless sensor networks is used to prolong the life-time of the networks and it is one of the most difficult tasks in the clustering procedure. in order to perform node clustering, a number of nodes are determined as cluster heads and the other ones are joined to one of these heads, based on different criteria, e.g. euclidean distance. so far, different approaches have been proposed for this process, and swarm and evolutionary algorithms have contributed much in this regard. in this study, a novel algorithm is proposed based on the artificial fish swarm algorithm (afsa) for the clustering procedure. in the proposed method, the performance of standard afsa is improved by increasing the balance between local and global searches. furthermore, a new mechanism is added to the base algorithm for improving convergence speed in clustering problems. the performance of the proposed technique is compared with a number of state-of-the-art techniques in this field and the outcomes indicate the supremacy of the proposed technique.",4 "the value iteration algorithm is not strongly polynomial for discounted dynamic programming.
this note provides a simple example demonstrating that, if exact computations are allowed, the number of iterations required for the value iteration algorithm to find an optimal policy for discounted dynamic programming problems may grow arbitrarily quickly with the size of the problem. in particular, the number of iterations can be exponential in the number of actions. thus, unlike policy iterations, the value iteration algorithm is not strongly polynomial for discounted dynamic programming.",4 "notes on information geometry and evolutionary processes. in order to analyze and extract different structural properties of distributions, one can introduce different coordinate systems over the manifold of distributions. in evolutionary computation, the walsh bases and the building block bases are often used to describe populations, which simplifies the analysis of evolutionary operators applying on them. quite independent of these approaches, information geometry has been developed as a geometric way to analyze different order dependencies between random variables (e.g., neural activations or genes). these notes briefly review the essentials of various coordinate bases and of information geometry. the goal is to give an overview and make the approaches comparable. besides introducing meaningful coordinate bases, information geometry also offers an explicit way to distinguish different order interactions, and it offers a geometric view on the manifold and thereby also on the operators that apply on it. for instance, uniform crossover can be interpreted as an orthogonal projection of a population along an m-geodesic, monotonously reducing the theta-coordinates that describe the interactions between genes.",13 "an information extraction approach to prescreen heart failure patients for clinical trials. to reduce the large amount of time spent screening, identifying, and recruiting patients into clinical trials, we need prescreening systems that are able to automate the data extraction and decision-making tasks that are typically relegated to clinical research study coordinators. however, a major obstacle is the vast amount of patient data available as unstructured free-form text in electronic health records. here we propose an information extraction-based approach that first automatically converts unstructured text into a structured form.
the structured data are then compared against a list of eligibility criteria using a rule-based system to determine which patients qualify for enrollment in a heart failure clinical trial. we show that we can achieve highly accurate results, with recall and precision values of 0.95 and 0.86, respectively. our system allowed us to significantly reduce the time needed to prescreen patients from weeks to minutes. our open-source information extraction modules are available for researchers and could be tested and validated in other cardiovascular trials. an approach such as this one may decrease costs and expedite clinical trials, and could enhance the reproducibility of trials across institutions and populations.",4 "do deep neural networks match related objects?: a survey of imagenet-trained classification models. deep neural networks (dnns) have shown state-of-the-art level performances in a wide range of complicated tasks. in recent years, studies have been actively conducted analyzing the black box characteristics of dnns in order to grasp the learning behaviours, tendencies, and limitations of dnns. in this paper, we investigate a limitation of dnns in image classification tasks and verify it with a method inspired by cognitive psychology. through analyzing failure cases of the imagenet classification task, we hypothesize that dnns do not sufficiently learn to associate related classes of objects. to verify how dnns understand the relatedness between object classes, we conducted experiments on an image database provided in cognitive psychology. we applied the imagenet-trained dnns to a database consisting of pairs of related and unrelated object images to compare the feature similarities and determine whether the pairs match each other. in the experiments, we observed that the dnns show limited performance in determining the relatedness between object classes. in addition, the dnns present somewhat improved performance in discovering relatedness based on similarity, but they perform weaker in discovering relatedness based on association. through these experiments, a novel analysis of the learning behaviour of dnns is provided and the limitation which needs to be overcome is suggested.",4 "recognizing textures with mobile cameras for pedestrian safety applications. as smartphone rooted distractions become commonplace, the lack of compelling safety measures has led to a rise in the number of injuries to distracted walkers.
various solutions address this problem by sensing a pedestrian's walking environment. existing camera-based approaches have been largely limited to obstacle detection and other forms of object detection. instead, we present terrafirma, an approach that performs material recognition on the pedestrian's walking surface. we explore, first, how well commercial off-the-shelf smartphone cameras can learn texture to distinguish among paving materials in uncontrolled outdoor urban settings. second, we aim at identifying when a distracted user is about to enter the street, which can be used to support safety functions such as warning the user to be cautious. to this end, we gather a unique dataset of street/sidewalk imagery from a pedestrian's perspective, spanning major cities like new york, paris, and london. we demonstrate that modern phone cameras can be enabled to distinguish materials of walking surfaces in urban areas with more than 90% accuracy, and to accurately identify when pedestrians transition from sidewalk to street.",4 "quality of geographic information: an ontological approach and artificial intelligence tools. the objective of this paper is to present one important aspect of the european ist-fet project ""rev!gis"": the methodology which has been developed for the translation (interpretation) of the quality of the data into ""fitness for use"" information, which we can confront with the user needs in their application. this methodology is based upon the notion of ""ontologies"" as a conceptual framework able to capture the explicit and implicit knowledge involved in the application. we do not address the general problem of formalizing such ontologies; instead, we rather try to illustrate this in three applications which are particular cases of the general ""data fusion"" problem. in each application, we show how to deploy our methodology, by comparing several possible solutions, and we try to enlighten the quality issues and the kind of solution to privilege, even at the expense of a highly complex computational approach. the expectation of the rev!gis project is that computationally tractable solutions will be available among the next generation of ai tools.",4 "domain adaptation with randomized expectation maximization. domain adaptation (da) is the task of classifying an unlabeled dataset (target) using a labeled dataset (source) from a related domain. the majority of successful da methods try to directly match the distributions of the source and target data by transforming the feature space.
despite their success, state of the art methods based on this approach are either involved or unable to directly scale to data with many features. this article shows that domain adaptation can be successfully performed by using a simple randomized expectation maximization (em) method. we consider two instances of the method, which involve logistic regression and support vector machine, respectively. the underlying assumption of the proposed method is the existence of a good single linear classifier for both source and target domain. the potential limitations of this assumption are alleviated by the flexibility of the method, which can directly incorporate deep features extracted from a pre-trained deep neural network. the resulting algorithm is strikingly easy to implement and apply. we test its performance on 36 real-life adaptation tasks over text and image data with diverse characteristics. the method achieves state-of-the-art results, competitive with those of involved end-to-end deep transfer-learning methods.",19 "adapting to the shifting intent of search queries. search engines today present results that are often oblivious to abrupt shifts in intent. for example, the query `independence day' usually refers to a us holiday, but the intent of this query abruptly changed during the release of a major film by that name. while no studies exactly quantify the magnitude of intent-shifting traffic, studies suggest that news events, seasonal topics, pop culture, etc account for 50% of search queries. this paper shows that the signals a search engine receives can be used to both determine that a shift in intent has happened, as well as find the result that is now more relevant. we present a meta-algorithm that marries a classifier with a bandit algorithm to achieve regret that depends logarithmically on the number of query impressions, under certain assumptions. we provide strong evidence that this regret is close to the best achievable. finally, via a series of experiments, we demonstrate that our algorithm outperforms prior approaches, particularly as the amount of intent-shifting traffic increases.",4 "spp-net: deep absolute pose regression with synthetic views. image based localization is one of the important problems in computer vision due to its wide applicability in robotics, augmented reality, and autonomous systems. there is a rich set of methods described in the literature on how to geometrically register a 2d image w.r.t.\ a 3d model.
recently, methods based on deep (and convolutional) feedforward networks (cnns) became popular for pose regression. however, cnn-based methods are still less accurate than geometry based methods despite being fast and memory efficient. in this work we design a deep neural network architecture based on sparse feature descriptors to estimate the absolute pose of an image. our choice of using sparse feature descriptors has two major advantages: first, our network is significantly smaller than the cnns proposed in the literature for this task---thereby making our approach more efficient and scalable. second---and more importantly---, the usage of sparse features allows to augment the training data with synthetic viewpoints, which leads to substantial improvements in the generalization performance to unseen poses. thus, our proposed method aims to combine the best of the two worlds---feature-based localization and cnn-based pose regression--to achieve state-of-the-art performance in absolute pose estimation. a detailed analysis of the proposed architecture and a rigorous evaluation on the existing datasets are provided to support our method.",4 "tensor principal component analysis via sum-of-squares proofs. we study a statistical model for the tensor principal component analysis problem introduced by montanari and richard: given an order-$3$ tensor $t$ of the form $t = \tau \cdot v_0^{\otimes 3} + a$, where $\tau \geq 0$ is a signal-to-noise ratio, $v_0$ is a unit vector, and $a$ is a random noise tensor, the goal is to recover the planted vector $v_0$. for the case that $a$ has iid standard gaussian entries, we give an efficient algorithm to recover $v_0$ whenever $\tau \geq \omega(n^{3/4} \log(n)^{1/4})$, and certify that the recovered vector is close to a maximum likelihood estimator, all with high probability over the random choice of $a$. the previous best algorithms with provable guarantees required $\tau \geq \omega(n)$. in the regime $\tau \leq o(n)$, natural tensor-unfolding-based spectral relaxations for the underlying optimization problem break down (in the sense that their integrality gap is large). to go beyond this barrier, we use convex relaxations based on the sum-of-squares method. our recovery algorithm proceeds by rounding a degree-$4$ sum-of-squares relaxation of the maximum-likelihood-estimation problem for the statistical model.
to complement our algorithmic results, we show that degree-$4$ sum-of-squares relaxations break down for $\tau \leq o(n^{3/4}/\log(n)^{1/4})$, which demonstrates that improving our current guarantees (by more than logarithmic factors) would require new techniques or might even be intractable. finally, we show how to exploit additional problem structure in order to solve our sum-of-squares relaxations, up to some approximation, very efficiently. our fastest algorithm runs in nearly-linear time using shifted (matrix) power iteration and has similar guarantees as above. the analysis of this algorithm also confirms a variant of the conjecture of montanari and richard about singular vectors of tensor unfoldings.",4 "constructing category-specific models for monocular object-slam. we present a new paradigm for real-time object-oriented slam with a monocular camera. contrary to previous approaches that rely on object-level models, we construct category-level models from cad collections which are now widely available. to alleviate the need for huge amounts of labeled data, we develop a rendering pipeline that enables synthesis of large datasets from a limited amount of manually labeled data. using data thus synthesized, we learn category-level models for object deformations in 3d, as well as discriminative object features in 2d. these category models are instance-independent and aid in the design of object landmark observations that can be incorporated in a generic monocular slam framework. where typical object-slam approaches usually solve only for object and camera poses, we also estimate object shape on-the-fly, allowing for a wide range of objects from the category to be present in the scene. moreover, since our 2d object features are learned discriminatively, the proposed object-slam system succeeds in several scenarios where sparse feature-based monocular slam fails due to insufficient features or parallax. also, the proposed category-models help in object instance retrieval, useful for augmented reality (ar) applications. we evaluate the proposed framework on multiple challenging real-world scenes and show --- to the best of our knowledge --- the first results of an instance-independent monocular object-slam system and the benefits it enjoys over feature-based slam methods.",4 "phase-only planar antenna array synthesis with fuzzy genetic algorithms.
this paper describes a new method for the synthesis of planar antenna arrays using fuzzy genetic algorithms (fgas), optimizing the phase excitation coefficients to best meet a desired radiation pattern. we present an application of a rigorous optimization technique based on fuzzy genetic algorithms (fgas); this optimizing algorithm is obtained by adjusting the control parameters of the standard version of the genetic algorithm (sgas) using a fuzzy logic controller (flc), depending on the best individual fitness and the population diversity measurements (pdm). the presented optimization algorithms were previously checked on a specific mathematical test function and show superior capabilities with respect to the standard version (sgas). a planar array of rectangular cells using probe feed is considered. the included example using fga demonstrates a good agreement between the desired and the calculated radiation patterns, compared with those obtained by sga.",4 "a domain-independent algorithm for plan adaptation. the paradigms of transformational planning, case-based planning, and plan debugging all involve a process known as plan adaptation - modifying or repairing an old plan so it solves a new problem. in this paper we provide a domain-independent algorithm for plan adaptation, demonstrate that it is sound, complete, and systematic, and compare it to other adaptation algorithms in the literature. our approach is based on a view of planning as searching a graph of partial plans. generative planning starts at the graph's root and moves from node to node using plan-refinement operators. in planning by adaptation, a library plan - an arbitrary node in the plan graph - is the starting point for the search, and the plan-adaptation algorithm can both apply the refinement operators available to a generative planner and also retract constraints and steps from the plan. our algorithm's completeness ensures that the adaptation algorithm will eventually search the entire graph, and its systematicity ensures that it will do so without redundantly searching any parts of the graph.",4 "optical images-based edge detection in synthetic aperture radar images. we address the issue of adapting optical images-based edge detection techniques for use in polarimetric synthetic aperture radar (polsar) imagery.
we modify the gravitational edge detection technique (inspired by the law of universal gravity) proposed by lopez-molina et al, using the non-standard neighbourhood configuration proposed by fu et al, to reduce the speckle noise in polarimetric sar imagery. we compare the modified and unmodified versions of the gravitational edge detection technique with the well-established one proposed by canny, as well as with a recent multiscale fuzzy-based technique proposed by lopez-molina et al. we also address the issues of aggregation of gray level images and edge detection filtering. the techniques addressed are applied to a mosaic built using class distributions obtained from a real scene, as well as to a true polsar image; the mosaic results are assessed using baddeley's delta metric. our experiments show that modifying the gravitational edge detection technique with a non-standard neighbourhood configuration produces better results than the original technique, as well as the other techniques used for comparison. these experiments show that adapting edge detection methods from computational intelligence for use in polsar imagery is a new field worthy of exploration.",4 "minimally faithful inversion of graphical models. inference amortization methods allow sharing statistical strength across related observations when learning to perform posterior inference. generally this requires the inversion of the dependency structure in the generative model, as the modeller must design and learn a distribution to approximate the posterior. previous methods invert the dependency structure in a heuristic way and fail to capture important dependencies in the model, therefore limiting the performance of the eventual inference algorithm. we introduce an algorithm for faithfully and minimally inverting the graphical model structure of a generative model. such inversions have two crucial properties: a) they do not encode any independence assertions absent from the model, and b) given this, they encode as many true independence assertions as possible. our algorithm works by simulating variable elimination on the generative model to reparametrize its distribution. we show in experiments how such minimal inversions can assist in performing better inference.",19 "towards a new science of clinical data intelligence.
in this paper we define clinical data intelligence as the analysis of data generated in clinical routine with the goal of improving patient care. we define a science of clinical data intelligence as a data analysis that permits the derivation of scientific, i.e., generalizable and reliable, results. we argue that a science of clinical data intelligence becomes sensible in the context of a big data analysis, i.e., with data from many patients and with complete patient information. we discuss that clinical data intelligence requires the joint efforts of knowledge engineering, information extraction (from textual and other unstructured data), statistics and statistical machine learning. we describe some of our main results as conjectures and relate them to a recently funded research project involving two major german university hospitals.",4 "uncertainty measurement with belief entropy on the interference effect in quantum-like bayesian networks. social dilemmas have been regarded as the essence of evolution game theory, in which the prisoner's dilemma game is the most famous metaphor for the problem of cooperation. recent findings revealed that people's behavior violated the sure thing principle in such games. classic probability methodologies have difficulty in explaining the underlying mechanisms of people's behavior. in this paper, a novel quantum-like bayesian network is proposed to accommodate the paradoxical phenomenon. the special network can take interference into consideration, which is likely to be an efficient way to describe the underlying mechanism. with the assistance of a belief entropy, named deng entropy, the paper proposes a belief distance to render the model practical. tested with empirical data, the proposed model is proved to be predictable and effective.",4 "human-in-the-loop artificial intelligence. little by little, newspapers are revealing the bright future that artificial intelligence (ai) is building. intelligent machines will help everywhere. however, this bright future has a dark side: a dramatic job market contraction before its unpredictable transformation. hence, in the near future, large numbers of job seekers will need financial support while catching up with these novel unpredictable jobs. this possible job market crisis has an antidote inside. in fact, the rise of ai is sustained by the biggest knowledge theft of recent years. learning ai machines are extracting knowledge from unaware skilled or unskilled workers by analyzing their interactions.
by passionately doing their jobs, these workers are digging their own graves. in this paper, we propose human-in-the-loop artificial intelligence (hit-ai) as a fairer paradigm for artificial intelligence systems. hit-ai will reward aware and unaware knowledge producers with a different scheme: decisions of ai systems generating revenues will repay the legitimate owners of the knowledge used for taking those decisions. as modern robin hoods, hit-ai researchers should fight for a fairer artificial intelligence that gives back what it steals.",4 "using state space differential geometry for nonlinear blind source separation. given a time series of multicomponent measurements of an evolving stimulus, nonlinear blind source separation (bss) seeks to find a ""source"" time series, comprised of statistically independent combinations of the measured components. in this paper, we seek a source time series with local velocity cross correlations that vanish everywhere in stimulus state space. however, in an earlier paper the local velocity correlation matrix was shown to constitute a metric on state space. therefore, nonlinear bss maps onto a problem of differential geometry: given the metric observed in the measurement coordinate system, find another coordinate system in which the metric is diagonal everywhere. we show how to determine if the observed data are separable in this way, and, if they are, we show how to construct the required transformation to the source coordinate system, which is essentially unique except for an unknown rotation that can be found by applying the methods of linear bss. thus, the proposed technique solves nonlinear bss in many situations or, at least, reduces it to linear bss, without the use of probabilistic, parametric, or iterative procedures. this paper also describes a generalization of this methodology that performs nonlinear independent subspace separation. in every case, the resulting decomposition of the observed data is an intrinsic property of the stimulus' evolution in the sense that it does not depend on the way the observer chooses to view it (e.g., the choice of the observing machine's sensors). in other words, the decomposition is a property of the evolution of the ""real"" stimulus that is ""out there"" broadcasting energy to the observer. the technique is illustrated with analytic and numerical examples.",4 "research on a multiple feature fusion image retrieval algorithm based on texture features and rough set theory.
recently, witnessed explosive growth images complex information content. order effectively precisely retrieve desired images large-scale image database low time consumption, propose multiple feature fusion image retrieval algorithm based texture feature rough set theory paper. contrast conventional approaches use single feature standard, fuse different features operation normalization. rough set theory assist us enhance robustness retrieval system facing incomplete data warehouse. enhance texture extraction paradigm, use wavelet gabor function holds better robustness. addition, perspectives internal external normalization, re-organize extracted feature better combination. numerical experiment verified general feasibility methodology. enhance overall accuracy compared state-of-the-art algorithms.",4 "inversenet: solving inverse problems splitting networks. propose new method uses deep learning techniques solve inverse problems. inverse problem cast form learning end-to-end mapping observed data ground-truth. inspired splitting strategy widely used regularized iterative algorithm tackle inverse problems, mapping decomposed two networks, one handling inversion physical forward model associated data term one handling denoising output former network, i.e., inverted version, associated prior/regularization term. two networks trained jointly learn end-to-end mapping, getting rid two-step training. training annealing intermediate variable two networks bridges gap input (the degraded version output) output progressively approaches ground-truth. proposed network, referred inversenet, flexible sense existing end-to-end network structure leveraged first network existing denoising network structure used second one.
extensive experiments synthetic data real datasets tasks, motion deblurring, super-resolution, colorization, demonstrate efficiency accuracy proposed method compared image processing algorithms.",4 "automatic mapping french discourse connectives pdtb discourse relations. paper, present approach exploit phrase tables generated statistical machine translation order map french discourse connectives discourse relations. using approach, created concoledisco, lexicon french discourse connectives pdtb relations. evaluated lexconn, concoledisco achieves recall 0.81 average precision 0.68 concession condition relations.",4 "measuring relations concepts conceptual spaces. highly influential framework conceptual spaces provides geometric way representing knowledge. instances represented points high-dimensional space concepts represented regions space. recent mathematical formalization framework capable representing correlations different domains geometric way. paper, extend formalization providing quantitative mathematical definitions notions concept size, subsethood, implication, similarity, betweenness. considerably increases representational power formalization introducing measurable ways describing relations concepts.",4 "factored particles scalable monitoring. exact monitoring dynamic bayesian networks intractable, approximate algorithms necessary. paper presents new family approximate monitoring algorithms combine best qualities particle filtering boyen-koller methods. algorithms maintain approximate representation belief state form sets factored particles, correspond samples clusters state variables. empirical results show algorithms outperform ordinary particle filtering boyen-koller algorithm large systems.",4 "generative model group conversation. conversations non-player characters (npcs) games typically confined dialogue human player virtual agent, conversation initiated controlled player. 
create richer, believable environments players, need conversational behavior reflect initiative part npcs, including conversations include multiple npcs interact one another well player. describe generative computational model group conversation agents, abstract simulation discussion small group setting. define conversational interactions terms rules turn taking interruption, well belief change, sentiment change, emotional response, dependent agent personality, context, relationships. evaluate model using parameterized expressive range analysis, observing correlations simulation parameters features resulting conversations. analysis confirms, example, character personalities predict often speak, heterogeneous groups characters generate belief change.",4 "complexity optimized crossover binary representations. consider computational complexity producing best possible offspring crossover, given two solutions parents. crossover operators studied class boolean linear programming problems, boolean vector variables used solution representation. means efficient reductions optimized gene transmitting crossover problems (ogtc) show polynomial solvability ogtc maximum weight set packing problem, minimum weight set partition problem one versions simple plant location problem. study connection ogtc linear boolean programming problem maximum weight independent set problem 2-colorable hypergraph prove np-hardness several special cases ogtc problem boolean linear programming.",4 "time-dependent hierarchical dirichlet model timeline generation. timeline generation aims summarizing news different epochs telling readers event evolves. new challenge combines salience ranking novelty detection. long-term public events, main topic usually includes various aspects across different epochs aspect evolving pattern. existing approaches neglect hierarchical topic structure involved news corpus timeline generation. 
paper, develop novel time-dependent hierarchical dirichlet model (hdm) timeline generation. model aptly detect different levels topic information across corpus structure used sentence selection. based topic mined from hdm, sentences selected considering different aspects relevance, coherence coverage. develop experimental systems evaluate 8 long-term events public concern. performance comparison different systems demonstrates effectiveness model terms rouge metrics.",4 "supervised learning multilayer spiking neural networks. current article introduces supervised learning algorithm multilayer spiking neural networks. algorithm presented overcomes limitations existing learning algorithms applied neurons firing multiple spikes principle applied linearisable neuron model. algorithm applied successfully various benchmarks, xor problem iris data set, well complex classification problems. simulations also show flexibility supervised learning algorithm permits different encodings spike timing patterns, including precise spike trains encoding.",4 "semi-automatic algorithm breast mri lesion segmentation using marker-controlled watershed transformation. magnetic resonance imaging (mri) effective imaging modality identifying localizing breast lesions women. accurate precise lesion segmentation using computer-aided-diagnosis (cad) system, crucial step evaluating tumor volume quantification tumor characteristics. however, challenging task, since breast lesions sophisticated shape, topological structure, high variance intensity distribution across patients. paper, propose novel marker-controlled watershed transformation-based approach, uses brightest pixels region interest (determined experts) markers overcome challenge, accurately segment lesions breast mri. proposed approach evaluated 106 lesions, includes 64 malignant 42 benign cases. segmentation results quantified comparison ground truth labels, using dice similarity coefficient (dsc) jaccard index (ji) metrics.
proposed method achieved average dice coefficient 0.7808$\pm$0.1729 jaccard index 0.6704$\pm$0.2167. results illustrate proposed method shows promise future work related segmentation classification benign malignant breast lesions.",4 "editorial first workshop mining scientific papers: computational linguistics bibliometrics. workshop ""mining scientific papers: computational linguistics bibliometrics"" (clbib 2015), co-located 15th international society scientometrics informetrics conference (issi 2015), brought together researchers bibliometrics computational linguistics order study ways bibliometrics benefit large-scale text analytics sense mining scientific papers, thus exploring interdisciplinarity bibliometrics natural language processing (nlp). goals workshop answer questions like: enhance author network analysis bibliometrics using data obtained text analytics? insights nlp provide structure scientific writing, citation networks, in-text citation analysis? workshop first step foster reflection interdisciplinarity benefits two disciplines bibliometrics natural language processing drive it.",4 "$k$-center clustering perturbation resilience. $k$-center problem canonical long-studied facility location clustering problem many applications symmetric asymmetric forms. versions problem tight approximation factors worst case instances: $2$-approximation symmetric $k$-center $o(\log^*(k))$-approximation asymmetric version. work, go beyond worst case provide strong positive results asymmetric symmetric $k$-center problems natural input stability (promise) condition called $\alpha$-perturbation resilience (bilu & linial 2012) , states optimal solution change $\alpha$-factor perturbation input distances. show assuming 2-perturbation resilience, exact solution asymmetric $k$-center problem found polynomial time. knowledge, first problem hard approximate constant factor worst case, yet optimally solved polynomial time perturbation resilience constant value $\alpha$. 
furthermore, prove result tight showing symmetric $k$-center $(2-\epsilon)$-perturbation resilience hard unless $np=rp$. first tight result problem perturbation resilience, i.e., first time exact value $\alpha$ problem switches np-hard efficiently computable found. results illustrate surprising relationship symmetric asymmetric $k$-center instances perturbation resilience. unlike approximation ratio, symmetric $k$-center easily solved factor $2$ asymmetric $k$-center cannot approximated constant factor, symmetric asymmetric $k$-center solved optimally resilience 2-perturbations.",4 "image restoration using autoencoding priors. propose leverage denoising autoencoder networks priors address image restoration problems. build key observation output optimal denoising autoencoder local mean true data density, autoencoder error (the difference output input trained autoencoder) mean shift vector. use magnitude mean shift vector, is, distance local mean, negative log likelihood natural image prior. image restoration, maximize likelihood using gradient descent backpropagating autoencoder error. key advantage approach need train separate networks different image restoration tasks, non-blind deconvolution different kernels, super-resolution different magnification factors. demonstrate state art results non-blind deconvolution super-resolution using autoencoding prior.",4 "home: household multimodal environment. introduce home: household multimodal environment artificial agents learn vision, audio, semantics, physics, interaction objects agents, within realistic context. home integrates 45,000 diverse 3d house layouts based suncg dataset, scale may facilitate learning, generalization, transfer. home open-source, openai gym-compatible platform extensible tasks reinforcement learning, language grounding, sound-based navigation, robotics, multi-agent learning, more. 
hope home better enables artificial agents learn humans do: interactive, multimodal, richly contextualized setting.",4 "classification ultrahigh-dimensional features. although much progress made classification high-dimensional features \citep{fan_fan:2008, jguo:2010, caisun:2014, prxu:2014}, classification ultrahigh-dimensional features, wherein features much outnumber sample size, defies existing work. paper introduces novel computationally feasible multivariate screening classification method ultrahigh-dimensional data. leveraging inter-feature correlations, proposed method enables detection marginally weak sparse signals recovery true informative feature set, achieves asymptotic optimal misclassification rates. also show proposed procedure provides powerful discovery boundaries compared \citet{caisun:2014} \citet{jjin:2009}. performance proposed procedure evaluated using simulation studies demonstrated via classification patients different post-transplantation renal functional types.",19 "multiset model multi-species evolution solve big deceptive problems. chapter presents smuga, integration symbiogenesis multiset genetic algorithm (muga). symbiogenetic approach used based host-parasite model novelty varying length parasites along evolutionary process. additionally, models collaborations multiple parasites single host. improve efficiency, introduced proxy evaluation parasites, saves fitness function calls exponentially reduces symbiotic collaborations produced. another novel feature consists breaking evolutionary cycle two phases: symbiotic phase phase independent evolution hosts parasites. smuga tested optimization variety deceptive functions, results one order magnitude better state art symbiotic algorithms. allowed optimize deceptive problems large sizes, showed linear scaling number iterations attain optimum.",4 "heuristic method generate better initial population evolutionary methods. 
initial population plays important role heuristic algorithms ga help decrease time algorithms need achieve acceptable result. furthermore, may influence quality final answer given evolutionary algorithms. paper, shall introduce heuristic method generate target based initial population possess two mentioned characteristics. efficiency proposed method shown presenting results tests benchmarks.",4 "local contrast learning. learning deep model small data yet opening challenging problem. focus one-shot classification deep learning approach based small quantity training samples. proposed novel deep learning approach named local contrast learning (lcl) based key insight human cognitive behavior human recognizes objects specific context contrasting objects context her/his memory. lcl used train deep model contrast recognizing sample couple contrastive samples randomly drawn shuffled. one-shot classification task omniglot, deep model based lcl 122 layers 1.94 millions parameters, trained tiny dataset 60 classes 20 samples per class, achieved accuracy 97.99% outperforms human state-of-the-art established bayesian program learning (bpl) trained 964 classes. lcl fundamental idea applied alleviate parametric model's overfitting resulted lack training samples.",4 "future frame prediction anomaly detection -- new baseline. anomaly detection videos refers identification events not conform expected behavior. however, almost existing methods tackle problem minimizing reconstruction errors training data, cannot guarantee larger reconstruction error abnormal event. paper, propose tackle anomaly detection problem within video prediction framework. best knowledge, first work leverages difference predicted future frame ground truth detect abnormal event.
predict future frame higher quality normal events, commonly used appearance (spatial) constraints intensity gradient, also introduce motion (temporal) constraint video prediction enforcing optical flow predicted frames ground truth frames consistent, first work introduces temporal constraint video prediction task. spatial motion constraints facilitate future frame prediction normal events, consequently facilitate identify abnormal events not conform expectation. extensive experiments toy dataset publicly available datasets validate effectiveness method terms robustness uncertainty normal events sensitivity abnormal events.",4 "fast support vector machines using parallel adaptive shrinking distributed systems. support vector machines (svm), popular machine learning technique, applied wide range domains science, finance, social networks supervised learning. whether identifying high-risk patients health-care professionals, potential high-school students enroll college school districts, svms play major role social good. paper undertakes challenge designing scalable parallel svm training algorithm large scale systems, includes commodity multi-core machines, tightly connected supercomputers cloud computing systems. intuitive techniques improving time-space complexity including adaptive elimination samples faster convergence sparse format representation proposed. sample elimination, several heuristics {\em earliest possible} {\em lazy} elimination non-contributing samples proposed. several cases, early sample elimination might result false positive, low overhead mechanisms reconstruction key data structures proposed. algorithm heuristics implemented evaluated various publicly available datasets. empirical evaluation shows 26x speed improvement datasets sequential baseline, evaluated multiple compute nodes, improvement execution time 30-60\% readily observed number datasets parallel baseline.",4 "nonparametric regression using deep neural networks relu activation function.
consider multivariate nonparametric regression model. shown estimators based sparsely connected deep neural networks relu activation function properly chosen network architecture achieve minimax rates convergence (up log n-factors) general composition assumption regression function. framework includes many well-studied structural constraints (generalized) additive models. lot flexibility network architecture, tuning parameter sparsity network. specifically, consider large networks number potential parameters much bigger sample size. analysis gives insights multilayer feedforward neural networks perform well practice. interestingly, depth (number layers) neural network architectures plays important role theory suggests scaling network depth logarithm sample size natural.",12 "exploiting multi-layer graph factorization multi-attributed graph matching. multi-attributed graph matching problem finding correspondences two sets data considering complex properties described multiple attributes. however, information multiple attributes likely oversimplified process makes integrated attribute, degrades matching accuracy. reason, multi-layer graph structure-based algorithm proposed recently. effectively avoid problem separating attributes multiple layers. nonetheless, several remaining issues scalability problem caused huge matrix describe multi-layer structure back-projection problem caused continuous relaxation quadratic assignment problem. work, propose novel multi-attributed graph matching algorithm based multi-layer graph factorization. reformulate problem solved several small matrices obtained factorizing multi-layer structure. then, solve problem using convex-concave relaxation procedure multi-layer structure. proposed algorithm exhibits better performance state-of-the-art algorithms based single-layer structure.",4 "projected subgradient methods learning sparse gaussians. gaussian markov random fields (gmrfs) useful broad range applications. 
paper tackle problem learning sparse gmrf high-dimensional space. approach uses l1-norm regularization inverse covariance matrix. utilize novel projected gradient method, faster previous methods practice equal best performing asymptotic complexity. also extend l1-regularized objective problem sparsifying entire blocks within inverse covariance matrix. methods generalize fairly easily case, methods not. demonstrate extensions give better generalization performance two real domains--biological network analysis 2d-shape modeling image task.",4 "one 3-parameter model testing. article offers 3-parameter model testing, 1) difference ability level examinee item difficulty; 2) examinee discrimination 3) item discrimination model parameters.",4 "iterated tabu search algorithm packing unequal circles circle. paper presents iterated tabu search algorithm (denoted its-pucc) solving problem packing unequal circles circle. algorithm exploits continuous combinatorial nature unequal circles packing problem. uses continuous local optimization method generate locally optimal packings. meanwhile, builds neighborhood structure set local minimum via two appropriate perturbation moves integrates two combinatorial optimization methods, tabu search iterated local search, systematically search good local minima. computational experiments two sets widely-used test instances prove effectiveness efficiency. first set 46 instances coming famous circle packing contest second set 24 instances widely used literature, algorithm able discover respectively 14 16 better solutions previous best-known records.",12 "copa: constrained parafac2 sparse & large datasets. parafac2 demonstrated success modeling irregular tensors, tensor dimensions vary across one modes. example scenario jointly modeling treatments across set patients varying number medical encounters, alignment events time bears clinical meaning, may also impossible align due varying length. 
despite recent improvements scaling unconstrained parafac2, model factors usually dense sensitive noise limits interpretability. result, following open challenges remain: a) various modeling constraints, temporal smoothness, sparsity non-negativity, needed imposed interpretable temporal modeling b) scalable approach required support constraints efficiently large datasets. tackle challenges, propose constrained parafac2 (copa) method, carefully incorporates optimization constraints temporal smoothness, sparsity, non-negativity resulting factors. efficiently support constraints, copa adopts hybrid optimization framework using alternating optimization alternating direction method multipliers (ao-admm). evaluated large electronic health record (ehr) datasets hundreds thousands patients, copa achieves significant speedups (up 36x faster) prior parafac2 approaches attempt handle subset constraints copa enables. overall, method outperforms baselines attempting handle subset constraints terms speed, achieving level accuracy.",4 "towards effective codebookless model image classification. bag-of-features (bof) model image classification thoroughly studied last decade. different widely used bof methods modeled images pre-trained codebook, alternative codebook free image modeling method, call codebookless model (clm), attracted little attention. paper, present effective clm represents image single gaussian classification. embedding gaussian manifold vector space, show simple incorporation clm linear classifier achieves competitive accuracy compared state-of-the-art bof methods (e.g., fisher vector). since clm lies high dimensional riemannian manifold, propose joint learning method low-rank transformation support vector machine (svm) classifier gaussian manifold, order reduce computational storage cost. study alleviate side effect background clutter clm, also present simple yet effective partial background removal method based saliency detection.
experiments extensively conducted eight widely used databases demonstrate effectiveness efficiency clm method.",4 "efficient sum outer products dictionary learning (soup-dil) - $\ell_0$ method. sparsity natural signals images transform domain dictionary extensively exploited several applications compression, denoising inverse problems. recently, data-driven adaptation synthesis dictionaries shown promise many applications compared fixed analytical dictionary models. however, dictionary learning problems typically non-convex np-hard, usual alternating minimization approaches problems often computationally expensive, computations dominated np-hard synthesis sparse coding step. work, investigate efficient method $\ell_{0}$ ""norm""-based dictionary learning first approximating training data set sum sparse rank-one matrices using block coordinate descent approach estimate unknowns. proposed block coordinate descent algorithm involves efficient closed-form solutions. particular, sparse coding step involves simple form thresholding. provide convergence analysis proposed block coordinate descent approach. numerical experiments show promising performance significant speed-ups provided method classical k-svd scheme sparse signal representation image denoising.",4 "neural recovery machine chinese dropped pronoun. dropped pronouns (dps) ubiquitous pro-drop languages like chinese, japanese etc. previous work mainly focused painstakingly exploring empirical features dps recovery. paper, propose neural recovery machine (nrm) model recover dps chinese, avoid non-trivial feature engineering process. experimental results show proposed nrm significantly outperforms state-of-the-art approaches two heterogeneous datasets. experiment results chinese zero pronoun (zp) resolution show performance zp resolution also improved recovering zps dps.",4 "ridi: robust imu double integration. 
paper proposes novel data-driven approach inertial navigation, learns estimate trajectories natural human motions inertial measurement unit (imu) every smartphone. key observation human motions repetitive consist major modes (e.g., standing, walking, turning). algorithm regresses velocity vector history linear accelerations angular velocities, corrects low-frequency bias linear accelerations, integrated twice estimate positions. acquired training data ground-truth motions across multiple human subjects multiple phone placements (e.g., bag hand). qualitative quantitative evaluations demonstrated algorithm surprisingly shown comparable results full visual inertial navigation. knowledge, paper first integrate sophisticated machine learning techniques inertial navigation, potentially opening new line research domain data-driven inertial navigation. publicly share code data facilitate research.",4 "discrete symbolic optimization boltzmann sampling continuous neural dynamics: gradient symbolic computation. gradient symbolic computation proposed means solving discrete global optimization problems using neurally plausible continuous stochastic dynamical system. gradient symbolic dynamics involves two free parameters must adjusted function time obtain global maximizer end computation. provide summary known gsc dynamics special cases settings parameters, also establish schedule two parameters convergence correct answer occurs high probability. results put empirical results already obtained gsc sound theoretical footing.",4 "synapse cap 2017 ner challenge: fasttext crf. present system cap 2017 ner challenge named entity recognition french tweets. system leverages unsupervised learning larger dataset french tweets learn features feeding crf model. ranked first without using gazetteer structured external data, f-measure 58.89\%.
best knowledge, first system use fasttext embeddings (which include subword representations) embedding-based sentence representation ner.",4 "data mining actionable knowledge: survey. data mining process consists series steps ranging data cleaning, data selection transformation, pattern evaluation visualization. one central problems data mining make mined patterns knowledge actionable. here, term actionable refers mined patterns suggest concrete profitable actions decision-maker. is, user something bring direct benefits (increase profits, reduction cost, improvement efficiency, etc.) organization's advantage. however, written comprehensive survey available topic. goal paper fill void. paper, first present two frameworks mining actionable knowledge inexplicitly adopted existing research methods. try situate research topic two different viewpoints: 1) data mining tasks 2) adopted framework. finally, specify issues either addressed insufficiently studied yet conclude paper.",4 "opinion mining relating subjective expressions annual earnings us financial statements. financial statements contain quantitative information manager's subjective evaluation firm's financial status. using information released u.s. 10-k filings. qualitative quantitative appraisals crucial quality financial decisions. extract opinioned statements reports, built tagging models based conditional random field (crf) techniques, considering variety combinations linguistic factors including morphology, orthography, predicate-argument structure, syntax, simple semantics. results show crf models reasonably effective find opinion holders experiments adopted popular mpqa corpus training testing. contribution paper identify opinion patterns multiword expressions (mwes) forms rather single word forms. find managers corporations attempt use optimistic words obfuscate negative financial performance accentuate positive financial performance. 
results also show decreasing earnings often accompanied ambiguous mild statements reporting year increasing earnings stated assertive positive way.",4 "back basics: bayesian extensions irt outperform neural networks proficiency estimation. estimating student proficiency important task computer based learning systems. compare family irt-based proficiency estimation methods deep knowledge tracing (dkt), recently proposed recurrent neural network model promising initial results. evaluate well model predicts student's future response given previous responses using two publicly available one proprietary data set. find irt-based methods consistently matched outperformed dkt across data sets finest level content granularity tractable trained on. hierarchical extension irt captured item grouping structure performed best overall. data sets included non-trivial autocorrelations student response patterns, temporal extension irt improved performance standard irt rnn-based method not. conclude irt-based models provide simpler, better-performing alternative existing rnn-based models student interaction data also affording interpretability guarantees due formulation bayesian probabilistic models.",4 "improving universality learnability neural programmer-interpreters combinator abstraction. overcome limitations neural programmer-interpreters (npi) universality learnability, propose incorporation combinator abstraction neural programing new npi architecture support abstraction, call combinatory neural programmer-interpreter (cnpi). combinator abstraction dramatically reduces number complexity programs need interpreted core controller cnpi, still allowing cnpi represent interpret arbitrary complex programs collaboration core components. propose small set four combinators capture pervasive programming patterns. 
due finiteness simplicity combinator set offloading burden interpretation core, able construct cnpi universal respect set combinatorizable programs, adequate solving algorithmic tasks. moreover, besides supervised training execution traces, cnpi trained policy gradient reinforcement learning appropriately designed curricula.",4 "soft rule ensembles statistical learning. article supervised learning problems solved using soft rule ensembles. first review importance sampling learning ensembles (isle) approach useful generating hard rules. soft rules obtained logistic regression corresponding hard rules. order deal perfect separation problem related logistic regression, firth's bias corrected likelihood used. various examples simulation results show soft rule ensembles improve predictive performance hard rule ensembles.",19 "identifying dogmatism social media: signals models. explore linguistic behavioral features dogmatism social media construct statistical models identify dogmatic comments. model based corpus reddit posts, collected across diverse set conversational topics annotated via paid crowdsourcing. operationalize key aspects dogmatism described existing psychology theories (such over-confidence), finding predictive power. also find evidence new signals dogmatism, tendency dogmatic posts refrain signaling cognitive processes. use predictive model analyze millions reddit posts, find evidence suggests dogmatism deeper personality trait, present dogmatic users across many different domains, users engage dogmatic comments tend show increases dogmatic posts themselves.",4 "jointly modeling embedding translation bridge video language. automatically describing video content natural language fundamental challenge multimedia. recurrent neural networks (rnn), models sequence dynamics, attracted increasing attention visual interpretation. 
however, existing approaches generate word locally given previous words visual content, relationship sentence semantics visual content holistically exploited. result, generated sentences may contextually correct semantics (e.g., subjects, verbs objects) true. paper presents novel unified framework, named long short-term memory visual-semantic embedding (lstm-e), simultaneously explore learning lstm visual-semantic embedding. former aims locally maximize probability generating next word given previous words visual content, latter create visual-semantic embedding space enforcing relationship semantics entire sentence visual content. proposed lstm-e consists three components: 2-d and/or 3-d deep convolutional neural networks learning powerful video representation, deep rnn generating sentences, joint embedding model exploring relationships visual content sentence semantics. experiments youtube2text dataset show proposed lstm-e achieves to-date best reported performance generating natural sentences: 45.3% 31.0% terms bleu@4 meteor, respectively. also demonstrate lstm-e superior predicting subject-verb-object (svo) triplets several state-of-the-art techniques.",4 "risk agoras: dialectical argumentation scientific reasoning. propose formal framework intelligent systems reason scientific domains, particular carcinogenicity chemicals, study properties. framework grounded philosophy scientific enquiry discourse, uses model dialectical argumentation. formalism enables representation scientific uncertainty conflict manner suitable qualitative reasoning domain.",4 "multigrid neural architectures. propose multigrid extension convolutional neural networks (cnns). rather manipulating representations living single spatial grid, network layers operate across scale space, pyramid grids. consume multigrid inputs produce multigrid outputs; convolutional filters within-scale cross-scale extent. aspect distinct simple multiscale designs, process input different scales. 
viewed terms information flow, multigrid network passes messages across spatial pyramid. consequence, receptive field size grows exponentially depth, facilitating rapid integration context. critically, multigrid structure enables networks learn internal attention dynamic routing mechanisms, use accomplish tasks modern cnns fail. experiments demonstrate wide-ranging performance advantages multigrid. cifar imagenet classification tasks, flipping single grid multigrid within standard cnn paradigm improves accuracy, compute parameter efficient. multigrid independent architectural choices; show synergy combination residual connections. multigrid yields dramatic improvement synthetic semantic segmentation dataset. strikingly, relatively shallow multigrid networks learn directly perform spatial transformation tasks, where, contrast, current cnns fail. together, results suggest continuous evolution features multigrid pyramid powerful alternative existing cnn designs flat grid.",4 "toward integrated framework automated development optimization online advertising campaigns. creating monitoring competitive cost-effective pay-per-click advertisement campaigns web-search channel resource demanding task terms expertise effort. assisting even automating work advertising specialist unrivaled commercial value. paper propose methodology, architecture, fully functional framework semi- fully- automated creation, monitoring, optimization cost-efficient pay-per-click campaigns budget constraints. campaign creation module generates automatically keywords based content web page advertised extended corresponding ad-texts. keywords used create automatically campaigns fully equipped appropriate values set. campaigns uploaded auctioneer platform start running. optimization module focuses learning process existing campaign statistics also applied strategies previous periods order invest optimally next period. objective maximize performance (i.e. clicks, actions) current budget constraint. 
fully functional prototype experimentally evaluated real world google adwords campaigns presents promising behavior regards campaign performance statistics outperforms systematically competing manually maintained campaigns.",4 "enabling factor analysis thousand-subject neuroimaging datasets. scale functional magnetic resonance image data rapidly increasing large multi-subject datasets becoming widely available high-resolution scanners adopted. inherent low-dimensionality information data led neuroscientists consider factor analysis methods extract analyze underlying brain activity. work, consider two recent multi-subject factor analysis methods: shared response model hierarchical topographic factor analysis. perform analytical, algorithmic, code optimization enable multi-node parallel implementations scale. single-node improvements result 99x 1812x speedups two methods, enables processing larger datasets. distributed implementations show strong scaling 3.3x 5.5x respectively 20 nodes real datasets. also demonstrate weak scaling synthetic dataset 1024 subjects, 1024 nodes 32,768 cores.",19 "meta-prod2vec - product embeddings using side-information recommendation. propose meta-prod2vec, novel method compute item similarities recommendation leverages existing item metadata. scenarios frequently encountered applications content recommendation, ad targeting web search. method leverages past user interactions items attributes compute low-dimensional embeddings items. specifically, item metadata injected model side information regularize item embeddings. show new item representations lead better performance recommendation tasks open music dataset.",4 "automatic curation golf highlights using multimodal excitement features. production sports highlight packages summarizing game's exciting moments essential task broadcast media. yet, requires labor-intensive video editing.
propose novel approach auto-curating sports highlights, use create real-world system editorial aid golf highlight reels. method fuses information players' reactions (action recognition high-fives fist pumps), spectators (crowd cheering), commentator (tone voice word analysis) determine interesting moments game. accurately identify start end frames key shot highlights additional metadata, player's name hole number, allowing personalized content summarization retrieval. addition, introduce new techniques learning classifiers reduced manual training data annotation exploiting correlation different modalities. work demonstrated major golf tournament, successfully extracting highlights live video streams four consecutive days.",4 "latent semantics action verbs reflect phonetic parameters intensity emotional content. conjuring thoughts, language reflects statistical patterns word co-occurrences turn come describe perceive world. whether counting frequently nouns verbs combine google search queries, extracting eigenvectors term document matrices made wikipedia lines shakespeare plots, resulting latent semantics capture associative links form concepts, also spatial dimensions embedded within surface structure language. shape movements objects found associated phonetic contrasts already toddlers, study explores whether articulatory acoustic parameters may likewise differentiate latent semantics action verbs. selecting 3 x 20 emotion, face, hand related verbs known activate premotor areas brain, mutual cosine similarities computed using latent semantic analysis lsa, resulting adjacency matrices compared based two different large scale text corpora; hawik tasa. applying hierarchical clustering identify common structures across two text corpora, verbs largely divide combined mouth hand movements versus emotional expressions. 
transforming verbs constituent phonemes, clustered small large size movements appear differentiated front versus back vowels corresponding increasing levels arousal. whereas clustered emotional verbs seem characterized sequences close versus open jaw produced phonemes, generating up- downwards shifts formant frequencies may influence perceived valence. suggesting, latent semantics action verbs reflect parameters intensity emotional polarity appear correlated articulatory contrasts acoustic characteristics phonemes.",4 "path planning kinematic constraints robot groups. path planning multiple robots well studied ai robotics communities. given discretized environment, robots need find collision-free paths set specified goal locations. robots fully anonymous, non-anonymous, organized groups. although powerful solvers abstract problem exist, make simplifying assumptions ignoring kinematic constraints, making difficult use resulting plans actual robots. paper, present solution takes kinematic constraints, maximum velocities, account, guaranteeing user-specified minimum safety distance robots. demonstrate approach simulation real robots 2d 3d environments.",4 "nonparametric inference auto-encoding variational bayes. would like learn latent representations low-dimensional highly interpretable. model characteristics gaussian process latent variable model. benefits negative gp-lvm complementary variational autoencoder, former provides interpretable low-dimensional latent representations latter able handle large amounts data use non-gaussian likelihoods. inspiration paper marry two approaches reap benefits both. order introduce novel approximate inference scheme inspired gp-lvm vae.
show experimentally approximation allows capacity generative bottle-neck (z) vae arbitrarily large without losing highly interpretable representation, allowing reconstruction quality unlimited z time low-dimensional space used perform ancestral sampling well means reason embedded data.",19 "biometric authorization system using gait biometry. human gait, new biometric aimed recognize individuals way walk come play increasingly important role visual surveillance applications. paper novel hybrid holistic approach proposed show behavioural walking characteristics used recognize unauthorized suspicious persons enter surveillance area. initially background modelled input video captured cameras deployed security foreground moving object individual frames segmented using background subtraction algorithm. gait representing spatial, temporal wavelet components extracted fused training testing multi class support vector machine models (svm). proposed system evaluated using side view videos nlpr database. experimental results demonstrate proposed system achieves pleasing recognition rate also results indicate classification ability svm radial basis function (rbf) better kernel functions.",4 "active ranking pairwise comparisons parametric assumptions help. consider sequential active ranking set n items based noisy pairwise comparisons. items ranked according probability given item beats randomly chosen item, ranking refers partitioning items sets pre-specified sizes according scores. notion ranking includes special cases identification top-k items total ordering items. first analyze sequential ranking algorithm counts number comparisons won, uses counts decide whether stop, compare another pair items, chosen based confidence intervals specified data collected point. prove algorithm succeeds recovering ranking using number comparisons optimal logarithmic factors. 
guarantee require structural properties underlying pairwise probability matrix, unlike significant body past work pairwise ranking based parametric models thurstone bradley-terry-luce models. long-standing open question whether imposing parametric assumptions allows improved ranking algorithms. stochastic comparison models, pairwise probabilities bounded away zero, second contribution resolve issue proving lower bound parametric models. shows, perhaps surprisingly, popular parametric modeling choices offer logarithmic gains stochastic comparisons.",4 "knowledge management economic intelligence reasoning temporal attributes. people make important decisions within time frame. hence, imperative employ means strategy aid effective decision making. consequently, economic intelligence (ei) emerged field aid strategic timely decision making organization. course attaining goal: indispensable optimistic towards provision conservation intellectual resource invested process decision making. intellectual resource nothing else knowledge actors well various processes effecting decision making. knowledge recognized strategic economic resource enhancing productivity key innovation organization community. thus, adequate management cognizance temporal properties highly indispensable. temporal properties knowledge refer date time (known timestamp) knowledge created well duration interval related knowledge. paper focuses needs user-centered knowledge management approach well exploitation associated temporal properties. perspective knowledge respect decision-problems projects ei. hypothesis possibility reasoning temporal properties exploitation knowledge ei projects foster timely decision making generation useful inferences available reusable knowledge new project.",4 "ground truth bias external cluster validity indices. noticed external cvis exhibit preferential bias towards larger smaller number clusters monotonic (directly inversely) number clusters candidate partitions. 
type bias caused functional form cvi model. example, popular rand index (ri) exhibits monotone increasing (ncinc) bias, jaccard index (ji) suffers monotone decreasing (ncdec) bias. type bias previously recognized literature. work, identify new type bias arising distribution ground truth (reference) partition candidate partitions compared. call new type bias ground truth (gt) bias. type bias occurs change reference partition causes change bias status (e.g., ncinc, ncdec) cvi. example, ncinc bias ri changed ncdec bias skewing distribution clusters ground truth partition. important users aware new type biased behaviour, since may affect interpretations cvi results. objective article study empirical theoretical implications gt bias. best knowledge, first extensive study property external cluster validity indices.",19 "analyzing language development network approach. paper propose new measures language development using network analyses, inspired recent surge interests network studies many real-world systems. children's care-takers' speech data longitudinal study represented series networks, word forms taken nodes collocation words links. measures properties networks, size, connectivity, hub authority analyses, etc., allow us make quantitative comparison reveal different paths development. example, asynchrony development network size average degree suggests children cannot simply classified early talkers late talkers one two measures. children follow different paths multi-dimensional space. may develop faster one dimension slower another dimension. network approach requires little preprocessing words analyses sentence structures, characteristics words usage emerge network independent grammatical presumptions. show change two articles ""the"" ""a"" roles important nodes network reflects progress children's syntactic development: two articles often start children's networks hubs later shift authorities, authorities constantly adult's networks.
network analyses provide new approach study language development, time language development also presents rich area network theories explore.",4 "training spiking neural networks based information theoretic costs. spiking neural network type artificial neural network neurons communicate spikes. spikes identical boolean events characterized time arrival. spiking neuron internal dynamics responds history inputs opposed current inputs only. properties spiking neural network rich intrinsic capabilities process spatiotemporal data. however, spikes discontinuous 'yes no' events, trivial apply traditional training procedures gradient descend spiking neurons. thesis propose use stochastic spiking neuron models probability spiking output continuous function parameters. formulate several learning tasks minimization certain information-theoretic cost functions use spiking output probability distributions. develop generalized description stochastic spiking neuron new spiking neuron model allows flexibly process rich spatiotemporal data. formulate derive learning rules following tasks: - supervised learning task detecting spatiotemporal pattern minimization negative log-likelihood (the surprisal) neuron's output - unsupervised learning task increasing stability neurons output minimization entropy - reinforcement learning task controlling agent modulated optimization filtered surprisal neuron's output. test derived learning rules several experiments spatiotemporal pattern detection, spatiotemporal data storing recall autoassociative memory, combination supervised unsupervised learning speed learning process, adaptive control simple virtual agents changing environments.",4 "indian sign language recognition using eigen value weighted euclidean distance based classification technique. sign language recognition one growing fields research today. many new techniques developed recently fields. 
paper, proposed system using eigen value weighted euclidean distance classification technique recognition various sign languages india. system comprises four parts: skin filtering, hand cropping, feature extraction classification. twenty four signs considered paper, ten samples, thus total two hundred forty images considered recognition rate obtained 97 percent.",4 "biologically inspired protection deep networks adversarial attacks. inspired biophysical principles underlying nonlinear dendritic computation neural circuits, develop scheme train deep neural networks make robust adversarial attacks. scheme generates highly nonlinear, saturated neural networks achieve state art performance gradient based adversarial examples mnist, despite never exposed adversarially chosen examples training. moreover, networks exhibit unprecedented robustness targeted, iterative schemes generating adversarial examples, including second-order methods. identify principles governing networks achieve robustness, drawing methods information geometry. find networks progressively create highly flat compressed internal representations sensitive input dimensions, still solving task. moreover, employ highly kurtotic weight distributions, also found brain, demonstrate kurtosis protect even linear classifiers adversarial attack.",19 "unsupervised context-sensitive spelling correction english dutch clinical free-text word character n-gram embeddings. present unsupervised context-sensitive spelling correction method clinical free-text uses word character n-gram embeddings. method generates misspelling replacement candidates ranks according semantic fit, calculating weighted cosine similarity vectorized representation candidate misspelling context. tune parameters model, generate self-induced spelling error corpora. perform experiments two languages. 
english, greatly outperform off-the-shelf spelling correction tools manually annotated mimic-iii test set, counter frequency bias noisy channel model, showing neural embeddings successfully exploited improve upon state-of-the-art. dutch, also outperform off-the-shelf spelling correction tool manually annotated clinical records antwerp university hospital, offer empirical evidence method counters frequency bias noisy channel model case well. however, context-sensitive model implementation noisy channel model obtain high scores test set, establishing state-of-the-art dutch clinical spelling correction noisy channel model.",4 "automatic calcium scoring low-dose chest ct using deep neural networks dilated convolutions. heavy smokers undergoing screening low-dose chest ct affected cardiovascular disease much lung cancer. low-dose chest ct scans acquired screening enable quantification atherosclerotic calcifications thus enable identification subjects increased cardiovascular risk. paper presents method automatic detection coronary artery, thoracic aorta cardiac valve calcifications low-dose chest ct using two consecutive convolutional neural networks. first network identifies labels potential calcifications according anatomical location second network identifies true calcifications among detected candidates. method trained evaluated set 1744 ct scans national lung screening trial. determine whether reconstruction images reconstructed soft tissue filters used calcification detection, evaluated method soft medium/sharp filter reconstructions separately. soft filter reconstructions, method achieved f1 scores 0.89, 0.89, 0.67, 0.55 coronary artery, thoracic aorta, aortic valve mitral valve calcifications, respectively. sharp filter reconstructions, f1 scores 0.84, 0.81, 0.64, 0.66, respectively. linearly weighted kappa coefficients risk category assignment based per subject coronary artery calcium 0.91 0.90 soft sharp filter reconstructions, respectively. 
results demonstrate presented method enables reliable automatic cardiovascular risk assessment low-dose chest ct scans acquired lung cancer screening.",4 "thing tried work well : deictic representation reinforcement learning. reinforcement learning methods operate propositional representations world state. representations often intractably large generalize poorly. using deictic representation believed viable alternative: promise generalization allowing use existing reinforcement-learning methods. yet, experiments learning deictic representations reported literature. paper explore effectiveness two forms deictic representation naïve propositional representation simple blocks-world domain. find, empirically, deictic representations actually worsen learning performance. conclude discussion possible causes results strategies effective learning domains objects.",4 "parsimonious topic models salient word discovery. propose parsimonious topic model text corpora. related models latent dirichlet allocation (lda), words modeled topic-specifically, even though many words occur similar frequencies across different topics. modeling determines salient words topic, topic-specific probabilities, rest explained universal shared model. further, lda topics principle present every document. contrast model gives sparse topic representation, determining (small) subset relevant topics document. derive bayesian information criterion (bic), balancing model complexity goodness fit. here, interestingly, identify effective sample size corresponding penalty specific parameter type model. minimize bic jointly determine entire model -- topic-specific words, document-specific topics, model parameter values, and total number topics -- wholly unsupervised fashion.
results three text corpora image dataset show model achieves higher test set likelihood better agreement ground-truth class labels, compared lda model designed incorporate sparsity.",4 "exploiting causal independence bayesian network inference. new method proposed exploiting causal independencies exact bayesian network inference. bayesian network viewed representing factorization joint probability multiplication set conditional probabilities. present notion causal independence enables one factorize conditional probabilities combination even smaller factors consequently obtain finer-grain factorization joint probability. new formulation causal independence lets us specify conditional probability variable given parents terms associative commutative operator, ""or"", ""sum"" ""max"", contribution parent. start simple algorithm bayesian network inference that, given evidence query variable, uses factorization find posterior distribution query. show algorithm extended exploit causal independence. empirical studies, based cpcs networks medical diagnosis, show method efficient previous methods allows inference larger networks previous algorithms.",4 "bat algorithm better intermittent search strategy. efficiency metaheuristic algorithm largely depends way balancing local intensive exploitation global diverse exploration. studies show bat algorithm provide good balance two key components superior efficiency. paper, first review commonly used metaheuristic algorithms, compare performance bat algorithm so-called intermittent search strategy. simulations, found bat algorithm better optimal intermittent search strategy. also analyse comparison results implications higher dimensional optimization problems. addition, also apply bat algorithm solving business optimization engineering design problems.",12 "multi-engine approach answer set programming.
answer set programming (asp) truly-declarative programming paradigm proposed area non-monotonic reasoning logic programming, recently employed many applications. development efficient asp systems is, thus, crucial. mind task improving solving methods asp, two usual ways reach goal: $(i)$ extending state-of-the-art techniques asp solvers, $(ii)$ designing new asp solver scratch. alternative trends build top state-of-the-art solvers, apply machine learning techniques choosing automatically ""best"" available solver per-instance basis. paper pursue latter direction. first define set cheap-to-compute syntactic features characterize several aspects asp programs. then, apply classification methods that, given features instances {\sl training} set solvers' performance instances, inductively learn algorithm selection strategies applied {\sl test} set. report results number experiments considering solvers different training test sets instances taken ones submitted ""system track"" 3rd asp competition. analysis shows that, applying machine learning techniques asp solving, possible obtain robust performance: approach solve instances compared solver entered 3rd asp competition. (to appear theory practice logic programming (tplp).)",4 "centroid-based summarization multiple documents: sentence extraction, utility-based evaluation, user studies. present multi-document summarizer, called mead, generates summaries using cluster centroids produced topic detection tracking system. also describe two new techniques, based sentence utility subsumption, applied evaluation single multiple document summaries. finally, describe two user studies test models multi-document summarization.",4 "fast deep learning model textual relevance biomedical information retrieval. publications life sciences characterized large technical vocabulary, many lexical semantic variations expressing concept. 
towards addressing problem relevance biomedical literature search, introduce deep learning model relevance document's text keyword style query. limited relatively small amount training data, model uses pre-trained word embeddings. these, model first computes variable-length delta matrix query document, representing difference two texts, passed deep convolution stage followed deep feed-forward network compute relevance score. results fast model suitable use online search engine. model robust outperforms comparable state-of-the-art deep learning approaches.",4 "deep residual bidir-lstm human activity recognition using wearable sensors. human activity recognition (har) become popular topic research wide application. development deep learning, new ideas appeared address har problems. here, deep network architecture using residual bidirectional long short-term memory (lstm) cells proposed. advantages new network include bidirectional connection concatenate positive time direction (forward state) negative time direction (backward state). second, residual connections stacked cells act highways gradients, pass underlying information directly upper layer, effectively avoiding gradient vanishing problem. generally, proposed network shows improvements temporal (using bidirectional cells) spatial (residual connections stacked deeply) dimensions, aiming enhance recognition rate. tested opportunity data set public domain uci data set, accuracy increased 4.78% 3.68%, respectively, compared previously reported results. finally, confusion matrix public domain uci data set analyzed.",4 "multi-label image recognition recurrently discovering attentional regions. paper proposes novel deep architecture address multi-label image recognition, fundamental practical task towards general visual understanding. current solutions task usually rely extra step extracting hypothesis regions (i.e., region proposals), resulting redundant computation sub-optimal performance. 
work, achieve interpretable contextualized multi-label image classification developing recurrent memorized-attention module. module consists two alternately performed components: i) spatial transformer layer locate attentional regions convolutional feature maps region-proposal-free way ii) lstm (long-short term memory) sub-network sequentially predict semantic labeling scores located regions capturing global dependencies regions. lstm also output parameters computing spatial transformer. large-scale benchmarks multi-label image classification (e.g., ms-coco pascal voc 07), approach demonstrates superior performances existing state-of-the-arts accuracy efficiency.",4 "adversarial dropout supervised semi-supervised learning. recently, training adversarial examples, generated adding small worst-case perturbation input examples, proved improve generalization performance neural networks. contrast individually biased inputs enhance generality, paper introduces adversarial dropout, minimal set dropouts maximize divergence outputs network dropouts training supervisions. identified adversarial dropout used reconfigure neural network train, demonstrated training reconfigured sub-network improves generalization performance supervised semi-supervised learning tasks mnist cifar-10. analyzed trained model reason performance improvement, found adversarial dropout increases sparsity neural networks standard dropout does.",4 "linguistics aspects underlying dynamics. recent years, central components new approach linguistics, minimalist program (mp) come closer physics. features minimalist program, unconstrained nature recursive merge, operation labeling algorithm operates interface narrow syntax conceptual-intentional sensory-motor interfaces, difference pronounced un-pronounced copies elements sentence build-up fibonacci sequence syntactic derivation sentence structures, directly accessible representation terms algebraic formalism. 
although scheme linguistic structures classical ones, find interesting productive isomorphism established mp structure, algebraic structures many-body field theory opening new avenues inquiry dynamics underlying central aspects linguistics.",4 "deepapt: nation-state apt attribution using end-to-end deep neural networks. recent years numerous advanced malware, aka advanced persistent threats (apt) allegedly developed nation-states. task attributing apt specific nation-state extremely challenging several reasons. nation-state usually single cyber unit develops advanced malware, rendering traditional authorship attribution algorithms useless. furthermore, apts use state-of-the-art evasion techniques, making feature extraction challenging. finally, dataset available apts extremely small. paper describe deep neural networks (dnn) could successfully employed nation-state apt attribution. use sandbox reports (recording behavior apt run dynamically) raw input neural network, allowing dnn learn high level feature abstractions apts itself. using test set 1,000 chinese russian developed apts, achieved accuracy rate 94.6%.",4 "deep unsupervised intrinsic image decomposition siamese training. harness modern intrinsic decomposition tools based deep learning increase applicability realworld use cases. traditional techniques derived retinex theory: handmade prior assumptions constrain optimization yield unique solution qualitatively satisfying limited set examples. modern techniques based supervised deep learning leverage largescale databases usually synthetic sparsely annotated. decomposition quality images wild therefore arguable. propose end-to-end deep learning solution trained without ground truth supervision, hard obtain. time-lapses form ubiquitous source data (under scene staticity assumption) capture constant albedo varying shading conditions. exploit natural relationship train unsupervised siamese manner image pairs. 
yet, the trained network applies to single images at inference time. we present a new dataset to demonstrate our siamese training on, and reach results that compete with the state of the art, despite the unsupervised nature of our training scheme. as evaluation is difficult, we rely on extensive experiments to analyze the strengths and weaknesses of the related methods.",4 "cascaded region-based densely connected network for event detection: a seismic application. automatic event detection from time series signals has wide applications, such as abnormal event detection in video surveillance and event detection in geophysical data. traditional detection methods detect events primarily through the use of similarity and correlation in the data. those methods can be inefficient and yield low accuracy. in recent years, owing to the significantly increased computational power, machine learning techniques have revolutionized many science and engineering domains. in this study, we apply a deep-learning-based method to the detection of events from time series seismic signals. however, a direct adaptation of similar ideas from 2d object detection to our problem faces two challenges. the first challenge is that the duration of an earthquake event varies significantly; the other is that the proposals generated are temporally correlated. to address these challenges, we propose a novel cascaded region-based convolutional neural network to capture earthquake events of different sizes, while incorporating contextual information to enrich the features of each individual proposal. to achieve a better generalization performance, we use densely connected blocks as the backbone of our network. because of the fact that some positive events are not correctly annotated, we further formulate the detection problem as a learning-from-noise problem. to verify the performance of our detection methods, we employ our methods on seismic data generated from a bi-axial ""earthquake machine"" located at a rock mechanics laboratory, and acquire labels with the help of experts. through numerical tests, we show that our novel detection techniques yield high accuracy. therefore, our novel deep-learning-based detection methods can potentially be powerful tools for locating events from time series data in various applications.",4 "on a possible similarity between gene and semantic networks. 
in several domains such as linguistics, molecular biology and the social sciences, holistic effects are hardly well-defined by the modeling of single units, and studies tend to understand macro structures with the help of meaningful and useful associations in fields such as social networks, systems biology and the semantic web. a stochastic multi-agent system offers an accurate theoretical framework and operational computing implementations to model large-scale associations, their dynamics and patterns extraction. we show that the clustering around a target object of the set of associations with this object can prove similarity; specific data in two case studies on gene-gene and term-term relationships lead to the idea of a common organizing principle of cognition with both random and deterministic effects.",4 "generic multiplicative methods for implementing machine learning algorithms on mapreduce. in this paper we introduce a generic model for multiplicative algorithms which is suitable for the mapreduce parallel programming paradigm. we implement three typical machine learning algorithms to demonstrate how similarity comparison, gradient descent and the power method, all classic learning techniques, fit this model well. two versions of large-scale matrix multiplication are discussed in this paper, and different methods are developed for both cases with regard to their unique computational characteristics and problem settings. in contrast to earlier research, we focus on fundamental linear algebra techniques that establish a generic approach for a range of algorithms, rather than on specific ways of scaling up algorithms one at a time. experiments show promising results when evaluated on both speedup and accuracy. compared with a standard implementation with computational complexity $o(m^3)$ in the worst case, our large-scale matrix multiplication experiments prove that our design is considerably more efficient and maintains a good speedup as the number of cores increases. algorithm-specific experiments also produce encouraging results on runtime performance.",4 "spatially constrained location prior for scene parsing. semantic context is an important and useful cue for scene parsing in complicated natural images with a substantial amount of variations in objects and the environment. 
this paper proposes a spatially constrained location prior (sclp) for the effective modelling of the global and local semantic context in a scene in terms of inter-class spatial relationships. unlike existing studies focusing on either the relative or the absolute location prior of objects, the sclp effectively incorporates both relative and absolute location priors by calculating object co-occurrence frequencies in spatially constrained image blocks. the sclp is general and can be used in conjunction with various visual feature-based prediction models, such as artificial neural networks and support vector machines (svm), to enforce spatial contextual constraints on class labels. using svm classifiers and a linear regression model, we demonstrate that the incorporation of sclp achieves superior performance compared to the state-of-the-art methods on the stanford background and sift flow datasets.",4 "on the power of joint wavelet-dct features for multispectral palmprint recognition. biometric-based identification has drawn a lot of attention in recent years. among all biometrics, palmprint is known to possess a rich set of features. in this paper we have proposed to use dct-based features in parallel with wavelet-based ones for palmprint identification. pca is applied to the features to reduce their dimensionality and a majority voting algorithm is used to perform classification. the features introduced here result in a near-perfectly accurate identification. this method is tested on a well-known multispectral palmprint database and an accuracy rate of 99.97-100\% is achieved, outperforming previous methods under similar conditions.",4 "multiphase image segmentation based on fuzzy membership functions and l1-norm fidelity. in this paper, we propose a variational multiphase image segmentation model based on fuzzy membership functions and l1-norm fidelity. then we apply the alternating direction method of multipliers to solve an equivalent problem, in which all the subproblems can be solved efficiently. specifically, we propose a fast method to calculate the fuzzy median. experimental results and comparisons show that the l1-norm based method is more robust to outliers such as impulse noise and keeps better contrast than its l2-norm counterpart. theoretically, we prove the existence of the minimizer and analyze the convergence of the algorithm.",12 "a chinese dataset with negative full forms for general abbreviation prediction. 
abbreviation is a common phenomenon across languages, especially in chinese. in most cases, if an expression can be abbreviated, its abbreviation is used more often than its fully expanded forms, since people tend to convey information in a concise way. for various language processing tasks, abbreviation is an obstacle to improving the performance, as the textual form of an abbreviation does not express useful information, unless it is expanded to the full form. abbreviation prediction means associating the fully expanded forms with their abbreviations. however, due to the deficiency in abbreviation corpora, such a task is limited in current studies, especially considering general abbreviation prediction, which should also include those full form expressions that do not have valid abbreviations, namely the negative full forms (nffs). corpora incorporating negative full forms for general abbreviation prediction are few in number. in order to promote research in this area, we build a dataset for general chinese abbreviation prediction, which needs a few preprocessing steps, and evaluate several different models on the built dataset. the dataset is available at https://github.com/lancopku/chinese-abbreviation-dataset",4 "self-supervised learning of motion capture. current state-of-the-art solutions for motion capture from a single camera are optimization driven: they optimize the parameters of a 3d human model so that its re-projection matches measurements in the video (e.g. person segmentation, optical flow, keypoint detections etc.). optimization models are susceptible to local minima. this has been the bottleneck that forced using clean green-screen like backgrounds at capture time, manual initialization, or switching to multiple cameras as input resource. in this work, we propose a learning based motion capture model for single camera input. instead of optimizing mesh and skeleton parameters directly, our model optimizes neural network weights that predict the 3d shape and skeleton configurations given a monocular rgb video. our model is trained using a combination of strong supervision from synthetic data, and self-supervision from differentiable rendering of (a) skeletal keypoints, (b) dense 3d mesh motion, and (c) human-background segmentation, in an end-to-end framework. 
we empirically show that our model combines the best of both worlds of supervised learning and test-time optimization: supervised learning initializes the model parameters in the right regime, ensuring a good pose and surface initialization at test time, without manual effort. self-supervision by back-propagating through differentiable rendering allows (unsupervised) adaptation of the model to the test data, and offers a much tighter fit than a pretrained fixed model. we show that the proposed model improves with experience and converges to low-error solutions where previous optimization methods fail.",4 "probabilistic prototype models for attributed graphs. this contribution proposes a new approach towards developing a class of probabilistic methods for classifying attributed graphs. the key concept is the random attributed graph, defined as an attributed graph whose nodes and edges are annotated by random variables. every node/edge has two random processes associated with it: the occurence probability and the probability distribution over the attribute values. these are estimated within a maximum likelihood framework. the likelihood of a random attributed graph to generate an outcome graph is used as a feature for classification. the proposed approach is fast and robust to noise.",4 "a simple hierarchical pooling data structure for loop closure. we propose a data structure obtained by hierarchically averaging bag-of-word descriptors during a sequence of views that achieves average speedups in large-scale loop closure applications ranging from 4 to 20 times on benchmark datasets. although simple, our method works as well as sophisticated agglomerative schemes at a fraction of the cost with minimal loss of performance.",4 "artificial agents and speculative bubbles. pertaining to agent-based computational economics (ace), this work presents two models for the rise and downfall of speculative bubbles through an exchange price fixing based on double auction mechanisms. the first model is based on a finite time horizon context, where the expected dividends decrease along time. the second model follows the {\em greater fool} hypothesis; the agent behaviour depends on the comparison of the estimated risk with the greater fool's. 
simulations shed light on the influential parameters and the necessary conditions for the apparition of speculative bubbles in an asset market within the considered framework.",4 "unsupervised learning of disentangled and interpretable representations from sequential data. we present a factorized hierarchical variational autoencoder, which learns disentangled and interpretable representations from sequential data without supervision. specifically, we exploit the multi-scale nature of information in sequential data by formulating it explicitly within a factorized hierarchical graphical model that imposes sequence-dependent priors and sequence-independent priors to different sets of latent variables. the model is evaluated on two speech corpora to demonstrate, qualitatively, its ability to transform speakers or linguistic content by manipulating different sets of latent variables; and quantitatively, its ability to outperform an i-vector baseline for speaker verification and to reduce the word error rate by as much as 35% in mismatched train/test scenarios for automatic speech recognition tasks.",4 "predictive linear-gaussian models of stochastic dynamical systems. models of dynamical systems based on predictive state representations (psrs) are defined strictly in terms of observable quantities, in contrast with traditional models (such as hidden markov models) that use latent variables or statespace representations. in addition, psrs have an effectively infinite memory, allowing them to model some systems that finite memory-based models cannot. thus far, psr models have primarily been developed for domains with discrete observations. here, we develop the predictive linear-gaussian (plg) model, a class of psr models for domains with continuous observations. we show that plg models subsume linear dynamical system models (also called kalman filter models or state-space models) while using fewer parameters. we also introduce an algorithm to estimate plg parameters from data, and contrast it with standard expectation maximization (em) algorithms used to estimate kalman filter parameters. 
we show that our algorithm is a consistent estimation procedure and present preliminary empirical results suggesting that our algorithm outperforms em, particularly as the model dimension increases.",4 "boolean matrix factorization and noisy completion via message passing. boolean matrix factorization and boolean matrix completion from noisy observations are desirable unsupervised data-analysis methods due to their interpretability, but they are hard to perform due to their np-hardness. we treat these problems as maximum a posteriori inference problems in a graphical model and present a message passing approach that scales linearly with the number of observations and factors. our empirical study demonstrates that message passing is able to recover low-rank boolean matrices, in the boundaries of theoretically possible recovery, and compares favorably with state-of-the-art in real-world applications, such as collaborative filtering with large-scale boolean data.",12 "more accurate tests for the statistical significance of result differences. statistical significance testing of differences in values of metrics like recall, precision and balanced f-score is a necessary part of empirical natural language processing. unfortunately, we find in a set of experiments that many commonly used tests often underestimate the significance and so are less likely to detect differences that exist between different techniques. this underestimation comes from an independence assumption that is often violated. we point out some useful tests that do not make this assumption, including computationally-intensive randomization tests.",4 "core kernels. the term ""core kernel"" stands for correlation-resemblance kernel. in many applications (e.g., vision), the data are often high-dimensional, sparse, and non-binary. we propose two types of (nonlinear) core kernels for non-binary sparse data and demonstrate the effectiveness of the new kernels through a classification experiment. core kernels are simple and have no tuning parameters. however, training a nonlinear kernel svm can be (very) costly in time and memory and may not be suitable for truly large-scale industrial applications (e.g. search). in order to make the proposed core kernels more practical, we develop basic probabilistic hashing algorithms which transform the nonlinear kernels into linear kernels.",19 "belief propagation and linear programming. 
belief propagation (bp) is a popular, distributed heuristic for performing map computations in graphical models. bp can be interpreted, from a variational perspective, as minimizing the bethe free energy (bfe). bp can also be used to solve a special class of linear programming (lp) problems. for this class of problems, map inference can be stated as an integer lp whose lp relaxation coincides with the minimization of the bfe at ``zero temperature"". we generalize these prior results and establish a tight characterization of the lp problems that can be formulated as an equivalent lp relaxation of map inference. moreover, we suggest an efficient, iterative annealing bp algorithm for solving this broader class of lp problems. we demonstrate the algorithm's performance on a set of weighted matching problems, using it as a cutting plane method to solve a sequence of lps tightened by adding ``blossom'' inequalities.",4 "nearly optimal robust subspace tracking and dynamic robust pca. we study the robust subspace tracking (rst) problem and obtain one of the first provable guarantees for it. the goal of rst is to track data that lies in a slowly changing low-dimensional subspace, and the subspaces themselves, while being robust to corruption by (often large magnitude) sparse outliers. it can be simply interpreted as a dynamic (time-varying) extension of robust pca, with the minor difference that rst also requires an online algorithm (short tracking delay). we propose an algorithm called norst (nearly optimal rst) and prove that it solves rst and hence dynamic robust pca under weakened versions of the standard rpca assumptions, slow subspace change, and two simple extra assumptions (a lower bound on outlier magnitudes, and independence of the columns of the low-rank matrix). our guarantee shows that norst enjoys a near optimal tracking delay of $o(r \log n \log(1/\varepsilon))$. the required delay between subspace change times is the same, and the memory complexity is $n$ times this value. here $n$ is the ambient space dimension and $r$ is the dimension of the changing subspaces in which the true data lies. thus both are also nearly optimal. finally, our guarantee also shows that norst has the best outlier tolerance compared with all previous rpca and rst methods, both theoretically and empirically (including for real videos), without requiring any model on the outlier support sets.",4 "using wikipedia to boost svd recommender systems. 
singular value decomposition (svd) has been used successfully in recent years in the area of recommender systems. in this paper we present how this model can be extended to consider both user ratings and information from wikipedia. by mapping items to wikipedia pages and quantifying their similarity, we are able to use this information in order to improve recommendation accuracy, especially when the sparsity is high. another advantage of the proposed approach is the fact that it can be easily integrated into any other svd implementation, regardless of additional parameters that may have been added to it. preliminary experimental results on the movielens dataset are encouraging.",4 "another perspective on default reasoning. the lexicographic closure of any given finite set of normal defaults is defined. a conditional assertion ""if a then b"" is in this lexicographic closure if, given the defaults and the fact a, one would conclude b. the lexicographic closure is essentially a rational extension of d, and of its rational closure, defined in a previous paper. it provides a logic of normal defaults that is different from the one proposed by r. reiter and that is rich enough not to require the consideration of non-normal defaults. a large number of examples are provided to show that the lexicographic closure corresponds to the basic intuitions behind reiter's logic of defaults.",4 "iteration complexity of randomized block-coordinate descent methods for minimizing a composite function. in this paper we develop a randomized block-coordinate descent method for minimizing the sum of a smooth and a simple nonsmooth block-separable convex function and prove that it obtains an $\epsilon$-accurate solution with probability at least $1-\rho$ in $o(\tfrac{n}{\epsilon} \log \tfrac{1}{\rho})$ iterations, where $n$ is the number of blocks. for strongly convex functions the method converges linearly. this extends recent results of nesterov [efficiency of coordinate descent methods on huge-scale optimization problems, core discussion paper #2010/2], which cover the smooth case, to composite minimization, while at the same time improving the complexity by the factor of 4 and removing $\epsilon$ from the logarithmic term. more importantly, in contrast with the aforementioned work in which the author achieves the results by applying the method to a regularized version of the objective function with an unknown scaling factor, we show that this is not necessary, thus achieving true iteration complexity bounds. 
in the smooth case we also allow for arbitrary probability vectors and non-euclidean norms. finally, we demonstrate numerically that the algorithm is able to solve huge-scale $\ell_1$-regularized least squares and support vector machine problems with a billion variables.",12 "an online learning-based framework for tracking. we study the tracking problem, namely, estimating the hidden state of an object over time, from unreliable and noisy measurements. the standard framework for the tracking problem is the generative framework, which is the basis of solutions such as the bayesian algorithm and its approximation, the particle filters. however, these solutions can be very sensitive to model mismatches. in this paper, motivated by online learning, we introduce a new framework for tracking. we provide an efficient tracking algorithm for this framework. we provide experimental results comparing this algorithm to the bayesian algorithm on simulated data. our experiments show that under slight model mismatches, our algorithm outperforms the bayesian algorithm.",4 "conflict-driven asp solving with external sources. answer set programming (asp) is a well-known problem solving approach based on nonmonotonic logic programs and efficient solvers. to enable access to external information, hex-programs extend programs with external atoms, which allow for a bidirectional communication between the logic program and external sources of computation (e.g., description logic reasoners and web resources). current solvers evaluate hex-programs by a translation to asp itself, in which the values of external atoms are guessed and verified after the ordinary answer set computation. this elegant approach does not scale with the number of external accesses in general, in particular in the presence of nondeterminism (which is instrumental for asp). in this paper, we present a novel, native algorithm for evaluating hex-programs which uses learning techniques. in particular, we extend conflict-driven asp solving techniques, which prevent the solver from running into the same conflict again, from ordinary to hex-programs. we show how to gain additional knowledge from external source evaluations and how to use it in a conflict-driven algorithm. we first target the uninformed case, i.e., when we have no extra information on external sources, and then extend our approach to the case where additional meta-information is available. 
experiments show that learning from external sources can significantly decrease both the runtime and the number of considered candidate compatible sets.",4 "audio-replay attack detection countermeasures. this paper presents the speech technology center (stc) replay attack detection systems proposed for the automatic speaker verification spoofing and countermeasures challenge 2017. in this study we focused on the comparison of different spoofing detection approaches. these were gmm based methods, high level features extraction with a simple classifier, and deep learning frameworks. experiments performed on the development and evaluation parts of the challenge dataset demonstrated stable efficiency of deep learning approaches in the case of changing acoustic conditions. at the same time the svm classifier with high level features provided a substantial input to the efficiency of the resulting stc systems according to the fusion systems results.",4 "automata networks for multi-party communication in the naming game. the naming game has been studied to explore the role of self-organization in the development and negotiation of linguistic conventions. in this paper, we define an automata networks approach to the naming game. two problems are faced: (1) the definition of an automata network for multi-party communicative interactions; and (2) the proof of convergence for three different orders in which the individuals are updated (updating schemes). finally, computer simulations are explored on two-dimensional lattices with the purpose of recovering the main features of the naming game and describing the dynamics under different updating schemes.",4 "post-hoc labeling of arbitrary eeg recordings for data-efficient evaluation of neural decoding methods. many cognitive, sensory and motor processes have correlates in oscillatory neural sources, which are embedded as a subspace in the recorded brain signals. decoding such processes from noisy magnetoencephalogram/electroencephalogram (m/eeg) signals usually requires the use of data-driven analysis methods. the objective evaluation of such decoding algorithms on experimental raw signals, however, is a challenge: the amount of available m/eeg data typically is limited, labels can be unreliable, and raw signals often are contaminated with artifacts. the latter is specifically problematic, if the artifacts stem from behavioral confounds of the oscillatory neural processes of interest. 
to overcome some of these problems, simulation frameworks have been introduced for benchmarking decoding methods. by generating artificial brain signals, however, such simulation frameworks make strong and partially unrealistic assumptions about brain activity, which limits the generalization of the obtained results to real-world conditions. in the present contribution, we strive to remove many shortcomings of current simulation frameworks and propose a versatile alternative, which allows for an objective evaluation and benchmarking of novel data-driven decoding methods for neural signals. its central idea is to utilize post-hoc labelings of arbitrary m/eeg recordings. this strategy makes it paradigm-agnostic and allows to generate comparatively large datasets with noiseless labels. source code and data of the novel simulation approach are made available for facilitating its adoption.",4 "integrating cardinal direction relations and orientation relations in qualitative spatial reasoning. we propose a calculus integrating two calculi well-known in qualitative spatial reasoning (qsr): frank's projection-based cardinal direction calculus, and a coarser version of freksa's relative orientation calculus. an original constraint propagation procedure is presented, which implements the interaction between the two integrated calculi. the importance of taking into account the interaction is shown with a real example providing an inconsistent knowledge base, whose inconsistency (a) cannot be detected by reasoning separately about each of the two components of the knowledge, because, taken separately, each is consistent, but (b) is detected by the proposed algorithm, thanks to the interaction knowledge propagated from each of the two components to the other.",4 "ppmf: a patient-based predictive modeling framework for early icu mortality prediction. to date, developing a good model for early intensive care unit (icu) mortality prediction is still challenging. this paper presents a patient based predictive modeling framework (ppmf) to improve the performance of icu mortality prediction using data collected during the first 48 hours of icu admission. ppmf consists of three main components verifying three related research hypotheses. the first component captures the dynamic changes of patients' status in the icu using their time series data (e.g., vital signs and laboratory tests). 
the second component is a local approximation algorithm that classifies patients based on their similarities. the third component is a gradient descent wrapper that updates the feature weights according to the classification feedback. experiments using data of mimiciii show that ppmf significantly outperforms: (1) the severity score systems, namely sasp iii, apache iv, and mpm0iii, (2) the aggregation based classifiers that utilize summarized time series, and (3) baseline feature selection methods.",4 "improved anomaly detection in crowded scenes via cell-based analysis of foreground speed, size and texture. a robust and efficient anomaly detection technique is proposed, capable of dealing with crowded scenes where traditional tracking based approaches tend to fail. initial foreground segmentation of the input frames confines the analysis to foreground objects and effectively ignores irrelevant background dynamics. input frames are split into non-overlapping cells, followed by extracting features based on motion, size and texture from each cell. each feature type is independently analysed for the presence of an anomaly. unlike most methods, a refined estimate of object motion is achieved by computing the optical flow of only the foreground pixels. the motion and size features are modelled by an approximated version of kernel density estimation, which is computationally efficient even for large training datasets. texture features are modelled by an adaptively grown codebook, with the number of entries in the codebook selected in an online fashion. experiments on the recently published ucsd anomaly detection dataset show that the proposed method obtains considerably better results than three recent approaches: mppca, social force, and mixture of dynamic textures (mdt). the proposed method is also several orders of magnitude faster than mdt, the next best performing method.",4 "learning deep features for one-class classification. we propose a deep learning-based solution for the problem of feature learning in one-class classification. the proposed method operates on top of a convolutional neural network (cnn) of choice and produces descriptive features while maintaining a low intra-class variance in the feature space for the given class. for this purpose two loss functions, compactness loss and descriptiveness loss, are proposed along with a parallel cnn architecture. 
a template matching-based framework is introduced to facilitate the testing process. extensive experiments on publicly available anomaly detection, novelty detection and mobile active authentication datasets show that the proposed deep one-class (doc) classification method achieves significant improvements over the state-of-the-art.",4 "online deforestation detection. deforestation detection using satellite images can make an important contribution to forest management. current approaches can be broadly divided into those that compare two images taken at similar periods of the year and those that monitor changes by using multiple images taken during the growing season. the cmfda algorithm described in zhu et al. (2012) is an algorithm that builds on the latter category by implementing a year-long, continuous, time-series based approach to monitoring images. this algorithm was developed for 30m resolution, 16-day frequency reflectance data from the landsat satellite. in this work we adapt the algorithm to 1km, 16-day frequency reflectance data from the modis sensor aboard the terra satellite. the cmfda algorithm is composed of two submodels which are fitted on a pixel-by-pixel basis. the first estimates the amount of surface reflectance as a function of the day of the year. the second estimates the occurrence of a deforestation event by comparing the last predicted and real reflectance values. for this comparison, the reflectance observations for six different bands are first combined into a forest index. the real and predicted values of the forest index are then compared, and high absolute differences for consecutive observation dates are flagged as deforestation events. our adapted algorithm also uses the two model framework. however, since the modis 13a2 dataset used includes reflectance data for different spectral bands than those included in the landsat dataset, we cannot construct the forest index. instead we propose two contrasting approaches: a multivariate and an index approach similar to that of cmfda.",19 "microstructure reconstruction using entropic descriptors. a multi-scale approach to the inverse reconstruction of a pattern's microstructure is reported. instead of a correlation function, a pair of entropic descriptors (eds) is proposed for a stochastic optimization method. the first measures a spatial inhomogeneity, for a binary pattern, or a compositional one, for a greyscale image. 
the second one quantifies a spatial or compositional statistical complexity. the eds reveal structural information that is dissimilar, at least in part, to that given by correlation functions at almost all discrete length scales. the method is tested on digitized binary and greyscale images. in both cases, a persuasive reconstruction of the microstructure is found.",3 "multichannel variable-size convolution for sentence classification. we propose mvcnn, a convolution neural network (cnn) architecture for sentence classification. it (i) combines diverse versions of pretrained word embeddings and (ii) extracts features of multigranular phrases with variable-size convolution filters. we also show that pretraining mvcnn is critical for good performance. mvcnn achieves state-of-the-art performance on four tasks: on small-scale binary, small-scale multi-class and large-scale twitter sentiment prediction, and on subjectivity classification.",4 "use of the dempster-shafer conflict metric to detect interpretation inconsistency. a model of the world built from sensor data may be incorrect even if the sensors are functioning correctly. possible causes include the use of inappropriate sensors (e.g. a laser looking at glass walls), sensor inaccuracies that accumulate (e.g. localization errors), a priori models that are wrong, or an internal representation that does not match the world (e.g. a static occupancy grid used with dynamically moving objects). we are interested in the case where the constructed model of the world is flawed, but there is no access to the ground truth that would allow the system to see the discrepancy, such as a robot entering an unknown environment. this paper considers the problem of determining when something is wrong using only the sensor data used to construct the world model. it proposes 11 interpretation inconsistency indicators based on the dempster-shafer conflict metric, con, and evaluates these indicators according to three criteria: the ability to distinguish true inconsistency from sensor noise (classification), to estimate the magnitude of discrepancies (estimation), and to determine the source(s) (if any) of sensing problems in the environment (isolation). the evaluation is conducted using data from a mobile robot with sonar and laser range sensors navigating indoor environments under controlled conditions. 
the evaluation shows that the gambino indicator performed best in terms of estimation (at best a 0.77 correlation), isolation, and classification of a sensing situation as degraded (7% false negative rate) or normal (0% false positive rate).",4 "image retrieval with fisher vectors of binary features. recently, the fisher vector representation of local features has attracted much attention because of its effectiveness in both image classification and image retrieval. another trend in the area of image retrieval is the use of binary features such as orb, freak, and brisk. considering the significant performance improvement in terms of accuracy in both image classification and retrieval by the fisher vector of continuous feature descriptors, if the fisher vector were also applied to binary features, we would receive similar benefits in binary feature based image retrieval and classification. in this paper, we derive a closed-form approximation of the fisher vector of binary features modeled by the bernoulli mixture model. we also propose accelerating the fisher vector by using the approximate value of the posterior probability. experiments show that the fisher vector representation significantly improves the accuracy of image retrieval compared with a bag of binary words approach.",4 "sever: a robust meta-algorithm for stochastic optimization. in high dimensions, most machine learning methods are brittle to even a small fraction of structured outliers. to address this, we introduce a new meta-algorithm that can take in a base learner such as least squares or stochastic gradient descent, and harden the learner to be resistant to outliers. our method, sever, possesses strong theoretical guarantees yet is also highly scalable -- beyond running the base learner itself, it only requires computing the top singular vector of a certain $n \times d$ matrix. we apply sever on a drug design dataset and a spam classification dataset, and find that in both cases it has substantially greater robustness than several baselines. on the spam dataset, with $1\%$ corruptions, we achieved $7.4\%$ test error, compared to $13.4\%-20.5\%$ for the baselines, and $3\%$ error on the uncorrupted dataset. 
similarly, on the drug design dataset, with $10\%$ corruptions, we achieved $1.42$ mean-squared error test error, compared to $1.51$-$2.33$ for the baselines, and $1.23$ error on the uncorrupted dataset.",4 "cryptocurrency portfolio management with deep reinforcement learning. portfolio management is the decision-making process of allocating an amount of fund into different financial investment products. cryptocurrencies are electronic and decentralized alternatives to government-issued money, with bitcoin as the best-known example of a cryptocurrency. this paper presents a model-less convolutional neural network with the historic prices of a set of financial assets as its input, outputting the portfolio weights of the set. the network is trained with 0.7 years' price data from a cryptocurrency exchange. the training is done in a reinforcement manner, maximizing the accumulative return, which is regarded as the reward function of the network. backtest trading experiments with a trading period of 30 minutes are conducted in the same market, achieving 10-fold returns in 1.8 months' periods. some recently published portfolio selection strategies are also used to perform the same back-tests, whose results are compared with the neural network. the network is not limited to cryptocurrency, but can be applied to any other financial markets.",4 "bayesian uncertainty estimation for batch normalized deep networks. deep neural networks have led to a series of breakthroughs, dramatically improving the state-of-the-art in many domains. the techniques driving these advances, however, lack a formal method to account for model uncertainty. while the bayesian approach to learning provides a solid theoretical framework to handle uncertainty, inference in bayesian-inspired deep neural networks is difficult. in this paper, we provide a practical approach to bayesian learning that relies on a regularization technique found in nearly every modern network, \textit{batch normalization}. we show that training a deep network using batch normalization is equivalent to approximate inference in bayesian models, and we demonstrate how this finding allows us to make useful estimates of the model uncertainty. with our approach, it is possible to make meaningful uncertainty estimates using conventional architectures without modifying the network or the training procedure. 
approach thoroughly validated series empirical experiments different tasks using various measures, outperforming baselines strong statistical significance displaying competitive performance recent bayesian approaches.",19 "physics-guided neural networks (pgnn): application lake temperature modeling. paper introduces novel framework combining scientific knowledge physics-based models neural networks advance scientific discovery. framework, termed physics-guided neural network (pgnn), leverages output physics-based model simulations along observational features generate predictions using neural network architecture. further, paper presents novel framework using physics-based loss functions learning objective neural networks, ensure model predictions show lower errors training set also scientifically consistent known physics unlabeled set. illustrate effectiveness pgnn problem lake temperature modeling, physical relationships temperature, density, depth water used design physics-based loss function. using scientific knowledge guide construction learning neural networks, able show proposed framework ensures better generalizability well scientific consistency results.",4 "new solution relative orientation problem using 3 points vertical direction. paper presents new method recover relative pose two images, using three points vertical direction information. vertical direction determined two ways: 1- using direct physical measurement like imu (inertial measurement unit), 2- using vertical vanishing point. knowledge vertical direction solves 2 unknowns among 3 parameters relative rotation, 3 homologous points requested position couple images. rewriting coplanarity equations leads simpler solution. remaining unknowns resolution performed algebraic method using grobner bases. elements necessary build specific algebraic solver given paper, allowing real-time implementation. 
results real synthetic data show efficiency method.",4 "optimal algorithm bandit zero-order convex optimization two-point feedback. consider closely related problems bandit convex optimization two-point feedback, zero-order stochastic convex optimization two function evaluations per round. provide simple algorithm analysis optimal convex lipschitz functions. improves \cite{dujww13}, provides optimal result smooth functions; moreover, algorithm analysis simpler, readily extend non-euclidean problems. algorithm based small surprisingly powerful modification gradient estimator.",4 "design statistical quality control procedures using genetic algorithms. general, use algebraic enumerative methods optimize quality control (qc) procedure detect critical random systematic analytical errors stated probabilities, probability false rejection minimum. genetic algorithms (gas) offer alternative, require knowledge objective function optimized search large parameter spaces quickly. explore application gas statistical qc, developed interactive gas based computer program designs novel near optimal qc procedure, given analytical process. program uses deterministic crowding algorithm. illustrative application program suggests potential design qc procedures significantly better 45 alternative ones used clinical laboratories.",4 "opportunistic adaptation knowledge discovery. adaptation long considered achilles' heel case-based reasoning since requires domain-specific knowledge difficult acquire. paper, two strategies combined order reduce knowledge engineering cost induced adaptation knowledge (ca) acquisition task: ca learned case base means knowledge discovery techniques, ca acquisition sessions opportunistically triggered, i.e., problem-solving time.",4 "bayesian test significance conditional independence: multinomial model. 
conditional independence tests (ci tests) received special attention lately machine learning computational intelligence related literature important indicator relationship among variables used models. field probabilistic graphical models (pgm)--which includes bayesian networks (bn) models--ci tests especially important task learning pgm structure data. paper, propose full bayesian significance test (fbst) tests conditional independence discrete datasets. fbst powerful bayesian test precise hypothesis, alternative frequentist's significance tests (characterized calculation \emph{p-value}).",19 "model shrinkage effect gamma process edge partition models. edge partition model (epm) fundamental bayesian nonparametric model extracting overlapping structure binary matrix. epm adopts gamma process ($\gamma$p) prior automatically shrink number active atoms. however, empirically found model shrinkage epm typically work appropriately leads overfitted solution. analysis expectation epm's intensity function suggested gamma priors epm hyperparameters disturb model shrinkage effect internal $\gamma$p. order ensure model shrinkage effect epm works appropriate manner, proposed two novel generative constructions epm: cepm incorporating constrained gamma priors, depm incorporating dirichlet priors instead gamma priors. furthermore, depm's model parameters including infinite atoms $\gamma$p prior could marginalized out, thus possible derive truly infinite depm (idepm) efficiently inferred using collapsed gibbs sampler. experimentally confirmed model shrinkage proposed models works well idepm indicated state-of-the-art performance generalization ability, link prediction accuracy, mixing efficiency, convergence speed.",19 "detection resolution rumours social media: survey. despite increasing use social media platforms information news gathering, unmoderated nature often leads emergence spread rumours, i.e. pieces information unverified time posting. 
time, openness social media platforms provides opportunities study users share discuss rumours, explore natural language processing data mining techniques may used find ways determining veracity. survey introduce discuss two types rumours circulate social media; long-standing rumours circulate long periods time, newly-emerging rumours spawned fast-paced events breaking news, reports released piecemeal often unverified status early stages. provide overview research social media rumours ultimate goal developing rumour classification system consists four components: rumour detection, rumour tracking, rumour stance classification rumour veracity classification. delve approaches presented scientific literature development four components. summarise efforts achievements far towards development rumour classification systems conclude suggestions avenues future research social media mining detection resolution rumours.",4 "study unsupervised adaptive crowdsourcing. consider unsupervised crowdsourcing performance based model wherein responses end-users essentially rated according responses correlate majority responses subtasks/questions. one setting, consider independent sequence identically distributed crowdsourcing assignments (meta-tasks), consider single assignment large number component subtasks. problems yield intuitive results overall reliability crowd factor.",4 "group symmetry non-gaussian covariance estimation. consider robust covariance estimation group symmetry constraints. non-gaussian covariance estimation, e.g., tyler scatter estimator multivariate generalized gaussian distribution methods, usually involve non-convex minimization problems. recently, shown underlying principle behind success extended form convexity geodesics manifold positive definite matrices. modern approach improve estimation accuracy exploit prior knowledge via additional constraints, e.g., restricting attention specific classes covariances adhere prior symmetry structures. 
paper, prove group symmetry constraints also geodesically convex therefore incorporated various non-gaussian covariance estimators. practical examples sets include: circulant, persymmetric complex/quaternion proper structures. provide simple numerical technique finding maximum likelihood estimates constraints, demonstrate performance advantage using synthetic experiments.",19 "learning, investments derivatives. recent crisis following flight simplicity put derivative businesses around world considerable pressure. argue traditional modeling techniques must extended include product design. propose quantitative framework creating products meet challenge optimal investors point view remaining relatively simple transparent.",17 "rewriting constraint models metamodels. important challenge constraint programming rewrite constraint models executable programs calculating solutions. phase constraint processing may require translations constraint programming languages, transformations constraint representations, model optimizations, tuning solving strategies. paper, introduce pivot metamodel describing common features constraint models including different kinds constraints, statements like conditionals loops, first-class elements like object classes predicates. metamodel general enough cope constructions many languages, object-oriented modeling languages logic languages, independent them. rewriting operations manipulate metamodel instances apart languages. consequence, rewriting operations apply whatever languages selected able manage model semantic information. bridge created metamodel space languages using parsing techniques. tools software engineering world useful implement framework.",4 "curious robot: learning visual representations via physical interactions. right supervisory signal train visual representations? current approaches computer vision use category labels datasets imagenet train convnets. 
however, case biological agents, visual representation learning require millions semantic labels. argue biological agents use physical interactions world learn visual representations unlike current vision systems use passive observations (images videos downloaded web). example, babies push objects, poke them, put mouth throw learn representations. towards goal, build one first systems baxter platform pushes, pokes, grasps observes objects tabletop environment. uses four different types physical interactions collect 130k datapoints, datapoint providing supervision shared convnet architecture allowing us learn visual representations. show quality learned representations observing neuron activations performing nearest neighbor retrieval learned representation. quantitatively, evaluate learned convnet image classification tasks show improvements compared learning without external data. finally, task instance retrieval, network outperforms imagenet network recall@1 3%",4 "deep multimodal semantic embeddings speech images. paper, present model takes input corpus images relevant spoken captions finds correspondence two modalities. employ pair convolutional neural networks model visual objects speech signals word level, tie networks together embedding alignment model learns joint semantic space modalities. evaluate model using image search annotation tasks flickr8k dataset, augmented collecting corpus 40,000 spoken captions using amazon mechanical turk.",4 "using english pivot extract persian-italian parallel sentences non-parallel corpora. effectiveness statistical machine translation system (smt) dependent upon amount parallel corpus used training phase. low-resource language pairs enough parallel corpora build accurate smt. paper, novel approach presented extract bilingual persian-italian parallel sentences non-parallel (comparable) corpus. study, english used pivot language compute matching scores source target sentences candidate selection phase. 
additionally, new monolingual sentence similarity metric, normalized google distance (ngd) proposed improve matching process. moreover, extensions baseline system applied improve quality extracted sentences measured bleu. experimental results show using new pivot based extraction increase quality bilingual corpus significantly consequently improves performance persian-italian smt system.",4 "learning repeat: fine grained action repetition deep reinforcement learning. reinforcement learning algorithms learn complex behavioral patterns sequential decision making tasks wherein agent interacts environment acquires feedback form rewards sampled it. traditionally, algorithms make decisions, i.e., select actions execute, every single time step agent-environment interactions. paper, propose novel framework, fine grained action repetition (figar), enables agent decide action well time scale repeating it. figar used improving deep reinforcement learning algorithm maintains explicit policy estimate enabling temporal abstractions action space. empirically demonstrate efficacy framework showing performance improvements top three policy search algorithms different domains: asynchronous advantage actor critic atari 2600 domain, trust region policy optimization mujoco domain deep deterministic policy gradients torcs car racing domain.",4 "trainable neuromorphic integrated circuit exploits device mismatch. random device mismatch arises result scaling cmos (complementary metal-oxide semi-conductor) technology deep submicron regime degrades accuracy analogue circuits. methods combat increase complexity design. developed novel neuromorphic system called trainable analogue block (tab), exploits device mismatch means random projections input higher dimensional space. tab framework inspired principles neural population coding operating biological nervous system. 
three neuronal layers, namely input, hidden, output, constitute tab framework, number hidden layer neurons far exceeding input layer neurons. here, present measurement results first prototype tab chip built using 65nm process technology show learning capability various regression tasks. tab chip exploits inherent randomness variability arising due fabrication process perform various learning tasks. additionally, characterise neuron discuss statistical variability tuning curve arises due random device mismatch, desirable property learning capability tab. also discuss effect number hidden neurons resolution output weights accuracy learning capability tab.",4 "temporal tessellation: unified approach video analysis. present general approach video understanding, inspired semantic transfer techniques successfully used 2d image analysis. method considers video 1d sequence clips, one associated semantics. nature semantics -- natural language captions labels -- depends task hand. test video processed forming correspondences clips clips reference videos known semantics, following which, reference semantics transferred test video. describe two matching methods, designed ensure (a) reference clips appear similar test clips (b), taken together, semantics selected reference clips consistent maintains temporal coherence. use method video captioning lsmdc'16 benchmark, video summarization summe tvsum benchmarks, temporal action detection thumos2014 benchmark, sound prediction greatest hits benchmark. method surpasses state art, four five benchmarks, importantly, single method know successfully applied diverse range tasks.",4 "listen, interact talk: learning speak via interaction. one long-term goals artificial intelligence build agent communicate intelligently human natural language. existing work natural language learning relies heavily training pre-collected dataset annotated labels, leading agent essentially captures statistics fixed external training data. 
training data essentially static snapshot representation knowledge annotator, agent trained way limited adaptiveness generalization behavior. moreover, different language learning process humans, language acquired communication taking speaking action learning consequences speaking action interactive manner. paper presents interactive setting grounded natural language learning, agent learns natural language interacting teacher learning feedback, thus learning improving language skills taking part conversation. achieve goal, propose model incorporates imitation reinforcement leveraging jointly sentence reward feedbacks teacher. experiments conducted validate effectiveness proposed approach.",4 "combining recurrent convolutional neural networks relation classification. paper investigates two different neural architectures task relation classification: convolutional neural networks recurrent neural networks. models, demonstrate effect different architectural choices. present new context representation convolutional neural networks relation classification (extended middle context). furthermore, propose connectionist bi-directional recurrent neural networks introduce ranking loss optimization. finally, show combining convolutional recurrent neural networks using simple voting scheme accurate enough improve results. neural models achieve state-of-the-art results semeval 2010 relation classification task.",4 "hilbert space methods reduced-rank gaussian process regression. paper proposes novel scheme reduced-rank gaussian process regression. method based approximate series expansion covariance function terms eigenfunction expansion laplace operator compact subset $\mathbb{r}^d$. approximate eigenbasis eigenvalues covariance function expressed simple functions spectral density gaussian process, allows gp inference solved computational cost scaling $\mathcal{o}(nm^2)$ (initial) $\mathcal{o}(m^3)$ (hyperparameter learning) $m$ basis functions $n$ data points. 
approach also allows rigorous error analysis hilbert space theory, show approximation becomes exact size compact subset number eigenfunctions go infinity. expansion generalizes hilbert spaces inner product defined integral specified input density. method compared previously proposed methods theoretically empirical tests simulated real data.",19 "additive model view sparse gaussian process classifier design. consider problem designing sparse gaussian process classifier (sgpc) generalizes well. viewing sgpc design constructing additive model like boosting, present efficient effective sgpc design method perform stage-wise optimization predictive loss function. introduce new methods two key components viz., site parameter estimation basis vector selection sgpc design. proposed adaptive sampling based basis vector selection method aids achieving improved generalization performance reduced computational cost. method also used conjunction site parameter estimation methods. similar computational storage complexities well-known information vector machine suitable large datasets. hyperparameters determined optimizing predictive loss function. experimental results show better generalization performance proposed basis vector selection method several benchmark datasets, particularly relatively smaller basis vector set sizes difficult datasets.",4 "evaluating link-based techniques detecting fake pharmacy websites. fake online pharmacies become increasingly pervasive, constituting 90% online pharmacy websites. need fake website detection techniques capable identifying fake online pharmacy websites high degree accuracy. study, compared several well-known link-based detection techniques large-scale test bed hyperlink graph encompassing 80 million links 15.5 million web pages, including 1.2 million known legitimate fake pharmacy pages. found qoc qol class propagation algorithms achieved accuracy 90% dataset. 
results revealed algorithms incorporate dual class propagation well inlink outlink information, page-level site-level graphs, better suited detecting fake pharmacy websites. addition, site-level analysis yielded significantly better results page-level analysis algorithms evaluated.",4 "path algorithm fused lasso signal approximator. lasso well known penalized regression model, adds $l_{1}$ penalty parameter $\lambda_{1}$ coefficients squared error loss function. fused lasso extends model also putting $l_{1}$ penalty parameter $\lambda_{2}$ difference neighboring coefficients, assuming natural ordering. paper, develop fast path algorithm solving fused lasso signal approximator computes solutions values $\lambda_1$ $\lambda_2$. supplement, also give algorithm general fused lasso case predictor matrix $\mathbf{X} \in \mathbb{R}^{n \times p}$ $\text{rank}(\mathbf{X})=p$.",19 "narrativeqa reading comprehension challenge. reading comprehension (rc)---in contrast information retrieval---requires integrating information reasoning events, entities, relations across full document. question answering conventionally used assess rc ability, artificial agents children learning read. however, existing rc datasets tasks dominated questions solved selecting answers using superficial information (e.g., local context similarity global term frequency); thus fail test essential integrative aspect rc. encourage progress deeper comprehension language, present new dataset set tasks reader must answer questions stories reading entire books movie scripts. tasks designed successfully answering questions requires understanding underlying narrative rather relying shallow pattern matching salience. show although humans solve tasks easily, standard rc models struggle tasks presented here. provide analysis dataset challenges presents.",4 "uniform deviation bounds unbounded loss functions like k-means. 
uniform deviation bounds limit difference model's expected loss loss empirical sample uniformly models learning problem. such, critical component empirical risk minimization. paper, provide novel framework obtain uniform deviation bounds loss functions *unbounded*. main application, allows us obtain bounds $k$-means clustering weak assumptions underlying distribution. fourth moment bounded, prove rate $\mathcal{o}\left(m^{-\frac12}\right)$ compared previously known $\mathcal{o}\left(m^{-\frac14}\right)$ rate. furthermore, show rate also depends kurtosis - normalized fourth moment measures ""tailedness"" distribution. provide improved rates progressively stronger assumptions, namely, bounded higher moments, subgaussianity bounded support.",19 "constraint-satisfaction parser context-free grammars. traditional language processing tools constrain language designers specific kinds grammars. contrast, model-based language specification decouples language design language processing. consequence, model-based language specification tools need general parsers able parse unrestricted context-free grammars. languages specified following approach may ambiguous, parsers must deal ambiguities. model-based language specification also allows definition associativity, precedence, custom constraints. therefore parsers generated model-driven language specification tools need enforce constraints. paper, propose fence, efficient bottom-up chart parser lexical syntactic ambiguity support allows specification constraints and, therefore, enables use model-based language specification practice.",4 "planning graph (dynamic) csp: exploiting ebl, ddb csp search techniques graphplan. paper reviews connections graphplan's planning-graph dynamic constraint satisfaction problem motivates need adapting csp search techniques graphplan algorithm. 
describes explanation based learning, dependency directed backtracking, dynamic variable ordering, forward checking, sticky values random-restart search strategies adapted graphplan. empirical results provided demonstrate augmentations improve graphplan's performance significantly (up 1000x speedups) several benchmark problems. special attention paid explanation-based learning dependency directed backtracking techniques empirically found useful improving performance graphplan.",4 "online object tracking proposal selection. tracking-by-detection approaches successful object trackers recent years. success largely determined detector model learn initially update time. however, challenging conditions object undergo transformations, e.g., severe rotation, methods found lacking. paper, address problem formulating proposal selection task making two contributions. first one introducing novel proposals estimated geometric transformations undergone object, building rich candidate set predicting object location. second one devising novel selection strategy using multiple cues, i.e., detection score edgeness score computed state-of-the-art object edges motion boundaries. extensively evaluate approach visual object tracking 2014 challenge online tracking benchmark datasets, show best performance.",4 "empirical analysis multiple-turn reasoning strategies reading comprehension tasks. reading comprehension (rc) challenging task requires synthesis information across sentences multiple turns reasoning. using state-of-the-art rc model, empirically investigate performance single-turn multiple-turn reasoning squad ms marco datasets. rc model end-to-end neural network iterative attention, uses reinforcement learning dynamically control number turns. find multiple-turn reasoning outperforms single-turn reasoning question answer types; further, observe enabling flexible number turns generally improves upon fixed multiple-turn strategy. 
across question types, particularly beneficial questions lengthy, descriptive answers. achieve results competitive state-of-the-art two datasets.",4 "sharing hash codes multiple purposes. locality sensitive hashing (lsh) powerful tool sublinear-time approximate nearest neighbor search, variety hashing schemes proposed different dissimilarity measures. however, hash codes significantly depend dissimilarity, prohibits users adjusting dissimilarity query time. paper, propose multiple purpose lsh (mp-lsh) shares hash codes different dissimilarities. mp-lsh supports l2, cosine, inner product dissimilarities, corresponding weighted sums, weights adjusted query time. also allows us modify importance pre-defined groups features. thus, mp-lsh enables us, example, retrieve similar items query user preference taken account, find similar material query properties (stability, utility, etc.) optimized, turn part multi-modal information (brightness, color, audio, text, etc.) image/video retrieval. theoretically empirically analyze performance three variants mp-lsh, demonstrate usefulness real-world data sets.",19 "learning non-gaussian time series using box-cox gaussian process. gaussian processes (gps) bayesian nonparametric generative models provide interpretability hyperparameters, admit closed-form expressions training inference, able accurately represent uncertainty. model general non-gaussian data complex correlation structure, gps paired expressive covariance kernel fed nonlinear transformation (or warping). however, overparametrising kernel warping known to, respectively, hinder gradient-based training make predictions computationally expensive. remedy issue (i) training model using derivative-free global-optimisation techniques find meaningful maxima model likelihood, (ii) proposing warping function based celebrated box-cox transformation requires minimal numerical approximations---unlike existing warped gp models. 
validate proposed approach first showing predictions computed analytically, learning, reconstruction forecasting experiment using real-world datasets.",19 "note sample complexity learning binary output neural networks fixed input distributions. show learning sample complexity sigmoidal neural network constructed sontag (1992) required achieve given misclassification error fixed purely atomic distribution grow arbitrarily fast: prescribed rate growth input distribution rate sample complexity, bound asymptotically tight. rate superexponential, non-recursive function, etc. observe sontag's ann glivenko-cantelli input distribution non-atomic part.",4 "invariant scattering convolution networks. wavelet scattering network computes translation invariant image representation, stable deformations preserves high frequency information classification. cascades wavelet transform convolutions non-linear modulus averaging operators. first network layer outputs sift-type descriptors whereas next layers provide complementary invariant information improves classification. mathematical analysis wavelet scattering networks explains important properties deep convolution networks classification. scattering representation stationary processes incorporates higher order moments thus discriminate textures fourier power spectrum. state art classification results obtained handwritten digits texture discrimination, using gaussian kernel svm generative pca classifier.",4 "interpretable 3d human action analysis temporal convolutional networks. discriminative power modern deep learning models 3d human action recognition growing ever potent. conjunction recent resurgence 3d human action representation 3d skeletons, quality pace recent progress significant. however, inner workings state-of-the-art learning based methods 3d human action recognition still remain mostly black-box. work, propose use new class models known temporal convolutional neural networks (tcn) 3d human action recognition. 
compared popular lstm-based recurrent neural network models, given interpretable input 3d skeletons, tcn provides us way explicitly learn readily interpretable spatio-temporal representations 3d human action recognition. provide strategy re-designing tcn interpretability mind characteristics model leveraged construct powerful 3d activity recognition method. work, wish take step towards spatio-temporal model easier understand, explain interpret. resulting model, res-tcn, achieves state-of-the-art results largest 3d human action recognition dataset, ntu-rgbd.",4 "generalized end-to-end loss speaker verification. paper, propose new loss function called generalized end-to-end (ge2e) loss, makes training speaker verification models efficient previous tuple-based end-to-end (te2e) loss function. unlike te2e, ge2e loss function updates network way emphasizes examples difficult verify step training process. additionally, ge2e loss require initial stage example selection. properties, model new loss function decreases speaker verification eer 10%, reducing training time 60% time. also introduce multireader technique, allows us domain adaptation - training accurate model supports multiple keywords (i.e. ""ok google"" ""hey google"") well multiple dialects.",6 "introduction bag features paradigm image classification retrieval. past decade seen growing popularity bag features (bof) approaches many computer vision tasks, including image classification, video search, robot localization, texture recognition. part appeal simplicity. bof methods based orderless collections quantized local image descriptors; discard spatial information therefore conceptually computationally simpler many alternative methods. despite this, perhaps this, bof-based systems set new performance standards popular image classification benchmarks achieved scalability breakthroughs image retrieval. paper presents introduction bof image representations, describes critical design choices, surveys bof literature. 
emphasis placed recent techniques mitigate quantization errors, improve feature detection, speed image retrieval. time, unresolved issues fundamental challenges raised. among unresolved issues determining best techniques sampling images, describing local image features, evaluating system performance. among fundamental challenges whether bof methods contribute localizing objects complex images, associating high-level semantics natural images. survey useful introducing new investigators field providing existing researchers consolidated reference related work.",4 "chases escapes, optimization problems. propose new approach solving combinatorial optimization problem utilizing mechanism chases escapes, long history mathematics. addition well-used steepest descent neighboring search, perform chase escape game ""landscape"" cost function. created concrete algorithm traveling salesman problem. preliminary test indicates possibility new fusion chases escapes problem combinatorial optimization search fruitful.",4 "permutation nmf. nonnegative matrix factorization(nmf) common used technique machine learning extract features data text documents images thanks natural clustering properties. particular, popular image processing since decompose several pictures recognize common parts they're located position photos. paper's aim present way add translation invariance classical nmf, is, algorithms presented able detect common features, even they're shifted, different original images.",4 "argumentation system reasoning conflict-minimal paraconsistent alc. semantic web open distributed environment hard guarantee consistency knowledge information. standard two-valued semantics everything entailed knowledge information inconsistent. semantics paraconsistent logic lp offers solution. however, available knowledge information consistent, set conclusions entailed three-valued semantics paraconsistent logic lp smaller set conclusions entailed two-valued semantics. 
preferring conflict-minimal three-valued interpretations eliminates this difference. preferring conflict-minimal interpretations, however, introduces non-monotonicity. to handle this non-monotonicity, this paper proposes an assumption-based argumentation system. the assumptions needed to close branches of a semantic tableaux form the arguments. stable extensions of the set of derived arguments correspond to conflict-minimal interpretations, and conclusions entailed by all conflict-minimal interpretations are supported by arguments in all stable extensions.",4 "nonparametric sparse representation. this paper suggests a nonparametric scheme to find the sparse solution of an underdetermined system of linear equations in the presence of unknown impulsive or non-gaussian noise. this approach is robust against any variations of the noise model parameters. it is based on minimization of the rank pseudo norm of the residual signal and the l_1-norm of the signal of interest, simultaneously. we use the steepest descent method to find the sparse solution via an iterative algorithm. simulation results show that our proposed method outperforms existing methods like omp, bp, lasso, and bcs whenever the observation vector is contaminated with measurement or environmental non-gaussian noise with unknown parameters. furthermore, under low snr conditions, our proposed method has better performance in the presence of gaussian noise.",4 "a hybrid approach for hindi-english machine translation. in this paper, an extended combined approach of phrase based statistical machine translation (smt), example based mt (ebmt) and rule based mt (rbmt) is proposed to develop a novel hybrid data driven mt system capable of outperforming the baseline smt, ebmt and rbmt systems from which it is derived. in short, the proposed hybrid mt process is guided by the rule based mt after getting a set of partial candidate translations provided by the ebmt and smt subsystems. previous works have shown that ebmt systems are capable of outperforming phrase-based smt systems and that the rbmt approach has the strength of generating structurally and morphologically accurate results. this hybrid approach increases the fluency, accuracy and grammatical precision, which improves the quality of the machine translation system. a comparison of the proposed hybrid machine translation (htm) model with renowned translators i.e.
google, bing and babylonian is also presented, which shows that the proposed model works better for sentences with ambiguity as well as those comprised of idioms, than the others.",4 "real-time distracted driver posture classification. distracted driving is a worldwide problem leading to an astoundingly increasing number of accidents and deaths. existing work is concerned with a small set of distractions (mostly, cell phone usage). also, for the most part, it uses unreliable ad-hoc methods to detect those distractions. in this paper, we present the first publicly available dataset for ""distracted driver"" posture estimation with more distraction postures than existing alternatives. in addition, we propose a reliable system that achieves a 95.98% driving posture classification accuracy. the system consists of a genetically-weighted ensemble of convolutional neural networks (cnns). we show that a weighted ensemble of classifiers using a genetic algorithm yields better classification confidence. we also study the effect of different visual elements (i.e. hands and face) on distraction detection by means of face and hand localizations. finally, we present a thinned version of our ensemble that could achieve a 94.29% classification accuracy and operate in a real-time environment.",4 applications of fuzzy logic and case-based reasoning. this article discusses applications of fuzzy logic ideas to formalizing the case-based reasoning (cbr) process and to measuring the effectiveness of cbr systems,4 "spectral clustering with jensen-type kernels and their multi-point extensions. motivated by multi-distribution divergences, which originate in information theory, we propose a notion of `multi-point' kernels, and study their applications. we study a class of kernels based on jensen type divergences and show that these can be extended to measure similarity among multiple points. we study tensor flattening methods and develop a multi-point (kernel) spectral clustering (msc) method. we emphasize a special case of the proposed kernels, which is a multi-point extension of the linear (dot-product) kernel, and show the existence of a cubic time tensor flattening algorithm in this case. finally, we illustrate the usefulness of our contributions using standard data sets and image segmentation tasks.",4 filament and flare detection in hα image sequences. solar storms can have a major impact on the infrastructure of the earth.
some of the causing events are observable from the ground in the h{\alpha} spectral line. in this paper we propose a new method for the simultaneous detection of flares and filaments in h{\alpha} image sequences. we therefore perform several preprocessing steps to enhance and normalize the images. based on the intensity values we segment the image by a variational approach. in a final postprocessing step we derive essential properties to classify the events and demonstrate the performance by comparing our obtained results to data annotated by an expert. the information produced by our method can be used for near real-time alerts and the statistical analysis of existing data by solar physicists.,4 "replica exchange using the q-gaussian swarm quantum particle intelligence method. we present a newly developed replica exchange algorithm using the q-gaussian swarm quantum particle optimization (rex@q-gsqpo) method for solving the problem of finding the global optimum. the basis of the algorithm is to run multiple copies of independent swarms at different values of the q parameter. based on an energy criterion, chosen to satisfy detailed balance, we swap the particle coordinates of neighboring swarms at regular iteration intervals. the swarm replicas with high q values are characterized by high diversity of particles, allowing escaping local minima faster, while the low q replicas, characterized by low diversity of particles, are used to sample the local basins efficiently. we compare the new algorithm with the standard gaussian swarm quantum particle optimization (gsqpo) and q-gaussian swarm quantum particle optimization (q-gsqpo) algorithms, and we find that the new algorithm is more robust in terms of the number of fitness function calls, and more efficient in terms of its ability to converge to the global minimum. in addition, we also provide a method for optimally allocating the swarm replicas among different q values. our algorithm is tested on three benchmark functions, which are known to be multimodal problems, at different dimensionalities. in addition, we considered a polyalanine peptide of 12 residues modeled using a g\=o coarse-graining potential energy function.",4 "learning deep convolutional features for mri based alzheimer's disease classification.
an effective and accurate diagnosis of alzheimer's disease (ad) and mild cognitive impairment (mci) is critical for early treatment and has thus attracted more attention nowadays. since it was first introduced, machine learning methods have been gaining increasing popularity in ad related research. among the various identified biomarkers, magnetic resonance imaging (mri) is widely used for the prediction of ad or mci. however, before a machine learning algorithm can be applied, image features need to be extracted to represent the mri images. good representations can be pivotal to the classification performance, but almost all previous studies typically rely on human labelling to find the regions of interest (roi) which may be correlated with ad, such as the hippocampus, amygdala, precuneus, etc. this procedure requires domain knowledge and is costly and tedious. instead of relying on extraction of roi features, it is more promising to remove manual roi labelling from the pipeline and directly work on the raw mri images. in other words, we let the machine learning methods figure out the most informative and discriminative image structures for ad classification. in this work, we propose to learn deep convolutional image features using unsupervised and supervised learning. deep learning has emerged as a powerful tool in the machine learning community and has been successfully applied to various tasks. we thus propose to exploit deep features of mri images based on a pre-trained large convolutional neural network (cnn) for ad and mci classification, which spares the effort of the manual roi annotation process.",4 "feature importance in bayesian assessment of newborn brain maturity from eeg. the methodology of bayesian model averaging (bma) is applied to the assessment of newborn brain maturity from sleep eeg. in theory this methodology provides more accurate assessments of uncertainty in decisions. however, the existing bma techniques have been shown to provide biased assessments in the absence of prior information enabling exploration of the model parameter space in detail within a reasonable time. the lack of detail leads to disproportional sampling of the posterior distribution. in the case of eeg assessment of brain maturity, the bma results can be biased because of the absence of information about eeg feature importance. in this paper we explore how posterior information about eeg features can be used in order to reduce the negative impact of disproportional sampling on the bma performance.
we use eeg data recorded from sleeping newborns to test the efficiency of the proposed bma technique.",4 "accnet: actor-coordinator-critic net for ""learning-to-communicate"" with deep multi-agent reinforcement learning. communication is a critical factor for the big multi-agent world to stay organized and productive. typically, previous multi-agent ""learning-to-communicate"" studies try to predefine the communication protocols or use technologies such as tabular reinforcement learning and evolutionary algorithms, which cannot generalize to a changing environment or a large collection of agents. in this paper, we propose the actor-coordinator-critic net (accnet) framework for solving the ""learning-to-communicate"" problem. accnet naturally combines the powerful actor-critic reinforcement learning technology with deep learning technology. it can efficiently learn the communication protocols even from scratch under a partially observable environment. we demonstrate that accnet can achieve better results than several baselines under both continuous and discrete action space environments. we also analyse the learned protocols and discuss our design considerations.",4 "random binary mappings for kernel learning and efficient svm. support vector machines (svms) are powerful learners that have led to state-of-the-art results in various computer vision problems. svms suffer from various drawbacks in terms of selecting the right kernel, which depends on the image descriptors, as well as computational and memory efficiency. this paper introduces a novel kernel that addresses these issues well. the kernel is learned by exploiting a large amount of low-complex, randomized binary mappings of the input feature. this leads to an efficient svm, while also alleviating the task of kernel selection. we demonstrate the capabilities of our kernel on 6 standard vision benchmarks, in which we combine several common image descriptors, namely histograms (flowers17 and daimler), attribute-like descriptors (uci, osr, a-voc08), and sparse quantization (imagenet). results show that our kernel learning adapts well to the different descriptor types, achieving the performance of the kernels specifically tuned for each image descriptor, and with a similar evaluation cost as efficient svm methods.",4 "deep learning based large scale visual recommendation and search for e-commerce.
in this paper, we present a unified end-to-end approach to build a large scale visual search and recommendation system for e-commerce. previous works have targeted these problems in isolation. we believe a more effective and elegant solution could be obtained by tackling them together. we propose a unified deep convolutional neural network architecture, called visnet, to learn embeddings to capture the notion of visual similarity, across several semantic granularities. we demonstrate the superiority of our approach for the task of image retrieval, by comparing against the state-of-the-art on the exact street2shop dataset. we then share the design decisions and trade-offs made while deploying this model to power visual recommendations across a catalog of 50m products, supporting 2k queries per second at flipkart, india's largest e-commerce company. the deployment of our solution has yielded a significant business impact, as measured by the conversion-rate.",4 "fast convergent algorithms for expectation propagation approximate bayesian inference. we propose a novel algorithm to solve the expectation propagation relaxation of bayesian inference for continuous-variable graphical models. in contrast to previous algorithms, our method is provably convergent. by marrying convergent ep ideas (opper&winther 05) with covariance decoupling techniques (wipf&nagarajan 08, nickisch&seeger 09), it runs at least an order of magnitude faster than the most commonly used ep solver.",19 "code completion with neural attention and pointer networks. intelligent code completion has become an essential tool to accelerate modern software development. to facilitate effective code completion for dynamically-typed programming languages, we apply neural language models by learning from large codebases, and investigate the effectiveness of attention mechanisms on the code completion task. however, standard neural language models, even with attention mechanisms, cannot correctly predict out-of-vocabulary (oov) words, which restricts the code completion performance. in this paper, inspired by the prevalence of locally repeated terms in program source code, and the recently proposed pointer networks which can reproduce words from the local context, we propose a pointer mixture network for better predicting oov words in code completion.
based on the context, the pointer mixture network learns to either generate a within-vocabulary word through an rnn component, or copy an oov word from the local context through a pointer component. experiments on two benchmarked datasets demonstrate the effectiveness of our attention mechanism and pointer mixture network on the code completion task.",4 "using dissortative mating genetic algorithms to track extrema in dynamic deceptive functions. traditional genetic algorithms' (gas) mating schemes select individuals for crossover independently of their genotypic or phenotypic similarities. in nature, this behaviour is known as random mating. however, non-random schemes - in which individuals mate according to their kinship or likeness - are more common in natural systems. previous studies indicate that, when applied to gas, negative assortative mating (a specific type of non-random mating, also known as dissortative mating) may improve their performance (on both speed and reliability) in a wide range of problems. dissortative mating maintains genetic diversity at a higher level during the run, and that fact is frequently observed as an explanation for dissortative gas' ability to escape local optima traps. dynamic problems, due to their specificities, demand special care when tuning a ga, because diversity plays an even more crucial role than when tackling static ones. this paper investigates the behaviour of dissortative mating gas, namely the recently proposed adaptive dissortative mating ga (admga), on dynamic trap functions. admga selects parents according to their hamming distance, via a self-adjustable threshold value. the method, by keeping the population diversity during the run, provides an effective means to deal with dynamic problems. tests conducted with deceptive and nearly deceptive trap functions indicate that admga is able to outperform other gas, some specifically designed for tracking moving extrema, on a wide range of tests, being particularly effective when the speed of change is fast. when comparing the algorithm with a previously proposed dissortative ga, the results show that performance is equivalent on the majority of the experiments, but admga performs better when solving the hardest instances of the test set.",4 "seeing small faces from a robust anchor's perspective.
this paper introduces a novel anchor design to support anchor-based face detection with superior scale-invariant performance, especially on tiny faces. to achieve this, we explicitly address the problem that anchor-based detectors drop performance drastically on faces with tiny sizes, e.g. less than 16x16 pixels. in this paper, we investigate why this is the case. we discover that current anchor design cannot guarantee high overlaps between tiny faces and anchor boxes, which increases the difficulty of training. a new expected max overlapping (emo) score is proposed which can theoretically explain the low overlapping issue and inspire several effective strategies of new anchor design leading to higher face overlaps, including anchor stride reduction with new network architectures, extra shifted anchors, and stochastic face shifting. comprehensive experiments show that our proposed method significantly outperforms the baseline anchor-based detector, while consistently achieving state-of-the-art results on challenging face detection datasets with competitive runtime speed.",4 "imprecise probability assessments and conditional probabilities with quasi additive classes of conditioning events. in this paper, starting from a generalized coherent (i.e. avoiding uniform loss) interval-valued probability assessment on a finite family of conditional events, we construct conditional probabilities with quasi additive classes of conditioning events which are consistent with the given initial assessment. quasi additivity assures coherence for the obtained conditional probabilities. in order to reach our goal we define a finite sequence of conditional probabilities by exploiting some theoretical results on g-coherence. in particular, we use solutions of a finite sequence of linear systems.",4 "towards end-to-end speech recognition with deep convolutional neural networks. convolutional neural networks (cnns) are effective models for reducing spectral variations and modeling spectral correlations in acoustic features for automatic speech recognition (asr). hybrid speech recognition systems incorporating cnns with hidden markov models/gaussian mixture models (hmms/gmms) have achieved the state-of-the-art in various benchmarks.
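The anchor-design abstract above argues that a coarse anchor stride caps the best IoU a tiny face can reach. That claim is easy to reproduce numerically: the sketch below places square anchors on a stride grid and measures the best achievable IoU for one face box. The face position, sizes, and grid extent are illustrative choices, not values from the paper, and this is plain IoU rather than the paper's EMO score.

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def max_anchor_iou(face, anchor_size, stride, extent=64):
    """Best IoU a face box can get from square anchors centered on a stride grid."""
    best, half = 0.0, anchor_size / 2.0
    for gx in range(0, extent + 1, stride):
        for gy in range(0, extent + 1, stride):
            anchor = (gx - half, gy - half, gx + half, gy + half)
            best = max(best, iou(face, anchor))
    return best
```

For a 16x16 face, halving or quartering the anchor stride raises the guaranteed best overlap substantially, which is exactly the "anchor stride reduction" strategy the abstract motivates.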
meanwhile, connectionist temporal classification (ctc) with recurrent neural networks (rnns), which was proposed for labeling unsegmented sequences, makes it feasible to train an end-to-end speech recognition system instead of hybrid settings. however, rnns are computationally expensive and sometimes difficult to train. in this paper, inspired by the advantages of both cnns and the ctc approach, we propose an end-to-end speech framework for sequence labeling, by combining hierarchical cnns with ctc directly without recurrent connections. by evaluating the approach on the timit phoneme recognition task, we show that the proposed model is not only computationally efficient, but also competitive with the existing baseline systems. moreover, we argue that cnns have the capability to model temporal correlations with appropriate context information.",4 "learning with tensors in reproducing kernel hilbert spaces with multilinear spectral penalties. we present a general framework to learn functions in tensor product reproducing kernel hilbert spaces (tp-rkhss). the methodology is based on a novel representer theorem suitable for existing as well as new spectral penalties for tensors. when the functions in the tp-rkhs are defined on the cartesian product of finite discrete sets, in particular, our main problem formulation admits as a special case existing tensor completion problems. other special cases include transfer learning with multimodal side information and multilinear multitask learning. for the latter case, our kernel-based view is instrumental to derive nonlinear extensions of existing model classes. we give a novel algorithm and show in experiments the usefulness of the proposed extensions.",4 "intra-and-inter-constraint-based video enhancement based on piecewise tone mapping. video enhancement plays an important role in various video applications. in this paper, we propose a new intra-and-inter-constraint-based video enhancement approach aiming to 1) achieve high intra-frame quality of the entire picture where multiple regions-of-interest (rois) can be adaptively and simultaneously enhanced, and 2) guarantee the inter-frame quality consistencies among video frames. we first analyze the features of different rois and create a piecewise tone mapping curve for the entire frame such that the intra-frame quality of a frame can be enhanced.
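The video-enhancement abstract above builds a piecewise tone mapping curve per frame. As a minimal illustration of what "applying a piecewise tone curve" means, the sketch below maps intensities through a piecewise-linear curve with `numpy.interp`; the knot values here are hypothetical (chosen to brighten shadows), not the ROI-derived curve from the paper.

```python
import numpy as np

def apply_tone_curve(frame, knots_in, knots_out):
    """Map pixel intensities in [0, 1] through a piecewise-linear tone curve.

    knots_in/knots_out define the curve's control points; np.interp
    evaluates the curve at every pixel of `frame`.
    """
    return np.interp(frame, knots_in, knots_out)
```

A monotone increasing curve preserves the ordering of pixel intensities, which is why tone mapping can boost contrast in chosen ranges without creating banding-style inversions.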
we then introduce new inter-frame constraints to improve the temporal quality consistency. experimental results show that our proposed algorithm obviously outperforms the state-of-the-art algorithms.",4 "a survey on visual analysis of human motion and its applications. this paper summarizes the recent progress in human motion analysis and its applications. in the beginning, we review the motion capture systems and the representation models of human motion data. next, we sketch the advanced human motion data processing technologies, including motion data filtering, temporal alignment, and segmentation. the following parts overview the state-of-the-art approaches to action recognition and dynamics measuring, since these are two active research areas in human motion analysis. the last part discusses some emerging applications of human motion analysis in healthcare, human robot interaction, security surveillance, virtual reality and animation. promising research topics of human motion analysis in the future are also summarized in the last part.",4 "a lexical analysis tool with ambiguity support. lexical ambiguities naturally arise in languages. we present lamb, a lexical analyzer that produces a lexical analysis graph describing all the possible sequences of tokens that can be found within the input string. parsers can process such lexical analysis graphs and discard any sequence of tokens that does not produce a valid syntactic sentence, therefore performing, together with lamb, a context-sensitive lexical analysis in lexically-ambiguous language specifications.",4 "random weights texture generation in one layer neural networks. recent work in the literature has shown experimentally that one can use the lower layers of a trained convolutional neural network (cnn) to model natural textures. interestingly, it has also been experimentally shown that only one layer with random filters can also model textures, although with less variability. in this paper we ask the question: why are one layer cnns with random filters so effective in generating textures? we theoretically show that one layer convolutional architectures (without a non-linearity), paired with the energy function used in previous literature, can in fact preserve and modulate frequency coefficients in a manner such that random weights and pretrained weights will generate the same type of images.
based on the results and analysis, we then question whether similar properties hold in the case where one uses one convolution layer with a non-linearity. we show that in the case of the relu non-linearity there are situations where only one input will give the minimum possible energy, whereas in the case of no nonlinearity, there are always infinite solutions that will give the minimum possible energy. thus we show that in certain situations adding a relu non-linearity generates less variable images.",4 "learning to understand phrases by embedding the dictionary. distributional models that learn rich semantic word representations are a success story of recent nlp research. however, developing models that learn useful representations of phrases and sentences has proved far harder. we propose using the definitions found in everyday dictionaries as a means of bridging this gap between lexical and phrasal semantics. neural language embedding models can be effectively trained to map dictionary definitions (phrases) to (lexical) representations of the words defined by those definitions. we present two applications of these architectures: ""reverse dictionaries"" that return the name of a concept given a definition or description, and general-knowledge crossword question answerers. on both tasks, neural language embedding models trained on definitions from a handful of freely-available lexical resources perform as well or better than existing commercial systems that rely on significant task-specific engineering. the results highlight the effectiveness of both neural embedding architectures and definition-based training for developing models that understand phrases and sentences.",4 "leveraging sparse and dense feature combinations for sentiment classification. neural networks are one of the most popular approaches for many natural language processing tasks such as sentiment analysis. they often outperform traditional machine learning models and achieve the state-of-art results on most tasks. however, many existing deep learning models are complex, difficult to train and provide a limited improvement over simpler methods. we propose a simple, robust and powerful model for sentiment classification. this model outperforms many deep learning models and achieves comparable results to other deep learning models with complex architectures on sentiment analysis datasets.
we publish our code online.",4 "synthesis of a supervised classification algorithm using intelligent and statistical tools. a fundamental task in detecting foreground objects in both static and dynamic scenes is to make the best choice of color system representation and an efficient technique for background modeling. we propose in this paper a non-parametric algorithm dedicated to segmenting and detecting objects in color images issued from a football sports meeting. indeed, segmentation by pixel concerns many applications and our method has revealed itself to be robust in detecting objects, even in the presence of strong shadows and highlights. on the other hand, to refine the playing strategy in football, handball, volley ball, rugby..., the coach needs maximum technical-tactics information on the on-going game and the players. we propose in this paper a range of algorithms allowing the resolution of many problems appearing in the automated process of team identification, where each player is affected to his corresponding team relying on visual data. the developed system was tested on a match of the tunisian national competition. this work is prominent for many subsequent computer vision studies, as we detail in this study.",4 "comparative analysis of methods for estimating axon diameter using dwi. the importance of studying brain microstructure is described and the existing state of the art of non-invasive methods for the investigation of brain microstructure using diffusion weighted magnetic resonance imaging (dwi) is studied. in the next step, cramer-rao lower bound (crlb) analysis is described and utilised for the assessment of the minimum estimation error and uncertainty level of different diffusion weighted magnetic resonance (dwmr) signal decay models. the analyses are performed considering the best scenario, in which we assume the models are an appropriate representation of the measured phenomena. this includes a study of the sensitivity of the estimations to the measurement and model parameters. it is demonstrated that none of the existing models achieve a reasonable minimum uncertainty level in a typical measurement setup. in the end, practical obstacles to achieving higher performance in clinical and experimental environments are studied and the effects on the feasibility of the methods are discussed.",4 "a mip backend for the idp system. the idp knowledge base system currently uses minisat(id) as its backend constraint programming (cp) solver.
similar systems have used a mixed integer programming (mip) solver as a backend. however, so far little is known about when a mip solver is preferable. this paper explores that question. it describes the use of cplex as a backend for idp and reports on experiments comparing both backends.",4 "a sensitivity analysis of (and practitioners' guide to) convolutional neural networks for sentence classification. convolutional neural networks (cnns) have recently achieved remarkably strong performance on the practically important task of sentence classification (kim 2014, kalchbrenner 2014, johnson 2014). however, these models require practitioners to specify an exact model architecture and set accompanying hyperparameters, including the filter region size, regularization parameters, and so on. it is currently unknown how sensitive model performance is to changes in these configurations for the task of sentence classification. we thus conduct a sensitivity analysis of one-layer cnns to explore the effect of architecture components on model performance; our aim is to distinguish between important and comparatively inconsequential design decisions for sentence classification. we focus on one-layer cnns (to the exclusion of more complex models) due to their comparative simplicity and strong empirical performance, which makes them a modern standard baseline method akin to support vector machines (svms) and logistic regression. we derive practical advice from our extensive empirical results for those interested in getting the most out of cnns for sentence classification in real world settings.",4 "the partner units configuration problem: completing the picture. the partner units problem (pup) is an acknowledged hard benchmark problem for the logic programming community, with various industrial application fields like surveillance, electrical engineering, computer networks and railway safety systems. however, its computational complexity has remained widely unclear so far. in this paper we provide the missing complexity results, making the pup better exploitable for benchmark testing. furthermore, we present quickpup, a heuristic search algorithm for pup instances which outperforms state-of-the-art solving approaches and is already in use in real world industrial configuration environments.",4 "opennmt: an open-source toolkit for neural machine translation.
we introduce an open-source toolkit for neural machine translation (nmt) to support research into model architectures, feature representations, and source modalities, while maintaining competitive performance, modularity and reasonable training requirements.",4 "the gf mathematics library. this paper is devoted to presenting the mathematics grammar library, a system for multilingual mathematical text processing. we explain the context in which it originated, its current design and functionality, and the current development goals. we also present two prototype services and comment on possible future applications in the area of artificial mathematics assistants.",4 "all you need is a good init. layer-sequential unit-variance (lsuv) initialization - a simple method for weight initialization for deep net learning - is proposed. the method consists of two steps. first, pre-initialize the weights of each convolution or inner-product layer with orthonormal matrices. second, proceed from the first to the final layer, normalizing the variance of the output of each layer to be equal to one. experiments with different activation functions (maxout, relu-family, tanh) show that the proposed initialization leads to learning of very deep nets that (i) produces networks with test accuracy better or equal to standard methods and (ii) is at least as fast as the complex schemes proposed specifically for very deep nets such as fitnets (romero et al. (2015)) and highway (srivastava et al. (2015)). performance is evaluated for googlenet, caffenet, fitnets and residual nets, and the state-of-the-art, or very close to it, is achieved on the mnist, cifar-10/100 and imagenet datasets.",4 "attentional push: augmenting salience with shared attention modeling. we present a novel visual attention tracking technique based on shared attention modeling. our proposed method models the viewer as a participant in the activity occurring in the scene. we go beyond image salience and, instead of only computing the power of an image region to pull attention to it, we also consider the strength with which other regions of the image push attention to the region in question. we use the term attentional push to refer to the power of image regions to direct and manipulate the attention allocation of the viewer. an attention model is presented that incorporates the attentional push cues with standard image salience-based attention modeling algorithms to improve the ability to predict where viewers will fixate.
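The LSUV abstract above spells out a two-step recipe (orthonormal pre-init, then per-layer variance normalization). Here is a minimal numpy sketch of that recipe for a stack of plain linear layers; non-linearities, convolutions, and the framework plumbing of the real method are omitted, so this is an illustration of the idea rather than the authors' code.

```python
import numpy as np

def orthonormal(shape, rng):
    """Step 1 of LSUV: orthonormal matrix via QR of a Gaussian draw."""
    q, _ = np.linalg.qr(rng.standard_normal(shape))
    return q

def lsuv_init(layer_shapes, x, tol=0.01, max_iter=10, seed=0):
    """Step 2 of LSUV: rescale each layer until its output variance on
    the batch x is 1, proceeding from the first layer to the last."""
    rng = np.random.default_rng(seed)
    weights, h = [], x
    for shape in layer_shapes:
        W = orthonormal(shape, rng)
        for _ in range(max_iter):
            v = (h @ W).var()
            if abs(v - 1.0) < tol:
                break
            W = W / np.sqrt(v)   # dividing by std drives the variance to 1
        weights.append(W)
        h = h @ W                 # propagate the batch to the next layer
    return weights
```

For purely linear layers one rescale already lands exactly on unit variance; the iteration loop matters once non-linearities are inserted between layers.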
experimental evaluation validates significant improvements in predicting viewers' fixations using the proposed methodology in both static and dynamic imagery.",4 "on a fixed budget, how should dwell time be spent in scanning electron microscopy to optimize image quality?. in scanning electron microscopy, the achievable image quality is often limited by a maximum feasible acquisition time per dataset. particularly with regard to three-dimensional or large field-of-view imaging, a compromise must be found between a high amount of shot noise, which leads to a low signal-to-noise ratio, and excessive acquisition times. assuming a fixed acquisition time per frame, we compared three different strategies for algorithm-assisted image acquisition in scanning electron microscopy. we evaluated (1) raster scanning with a reduced dwell time per pixel followed by a state-of-the-art denoising algorithm, (2) raster scanning with a decreased resolution in conjunction with a state-of-the-art super resolution algorithm, and (3) a sparse scanning approach where a fixed percentage of pixels is visited by the beam in combination with state-of-the-art inpainting algorithms. additionally, we considered increased beam currents for each of the strategies. the experiments showed that sparse scanning using an appropriate reconstruction technique is superior to the other strategies.",4 "phase and tv based convex sets for blind deconvolution of microscopic images. in this article, two closed and convex sets for the blind deconvolution problem are proposed. most blurring functions in microscopy are symmetric with respect to the origin. therefore, they do not modify the phase of the fourier transform (ft) of the original image. as a result, the blurred image and the original image have the same ft phase. therefore, the set of images with a prescribed ft phase can be used as a constraint set in blind deconvolution problems. another convex set that can be used during the image reconstruction process is the epigraph set of the total variation (tv) function. this set does not need a prescribed upper bound on the total variation of the image. the upper bound is automatically adjusted according to the current image of the restoration process. both closed and convex sets can be used as part of any blind deconvolution algorithm. simulation examples are presented.",12 "learning a peptide-protein binding affinity predictor with kernel ridge regression.
we propose a specialized string kernel for small bio-molecules, peptides and pseudo-sequences of binding interfaces. the kernel incorporates physico-chemical properties of amino acids and elegantly generalizes eight kernels, such as the oligo, the weighted degree, the blended spectrum, and the radial basis function. we provide a low complexity dynamic programming algorithm for the exact computation of the kernel and a linear time algorithm for its approximation. combined with kernel ridge regression and supck, a novel binding pocket kernel, the proposed kernel yields biologically relevant and good prediction accuracy on the pepx database. for the first time, a machine learning predictor is capable of accurately predicting the binding affinity of any peptide to any protein. the method was also applied to both single-target and pan-specific major histocompatibility complex class ii benchmark datasets and three quantitative structure affinity model benchmark datasets. on all benchmarks, our method significantly (p-value < 0.057) outperforms the current state-of-the-art methods at predicting peptide-protein binding affinities. the proposed approach is flexible and can be applied to predict any quantitative biological activity. the method should be of value to a large segment of the research community, with the potential to accelerate peptide-based drug and vaccine development.",16 "wide-residual-inception networks for real-time object detection. since convolutional neural network (cnn) models emerged, several tasks in computer vision have actively deployed cnn models for feature extraction. however, the conventional cnn models have a high computational cost and require high memory capacity, which is impractical and unaffordable for commercial applications such as real-time on-road object detection on embedded boards or mobile platforms. to tackle this limitation of cnn models, this paper proposes a wide-residual-inception (wr-inception) network, which constructs the architecture based on a residual inception unit that captures objects of various sizes on the same feature map, as well as shallower and wider layers, compared to state-of-the-art networks like resnet.
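The peptide-affinity abstract above combines a custom string kernel with kernel ridge regression. The KRR part has a simple closed form, alpha = (K + lam*I)^{-1} y, which the sketch below implements; a generic RBF kernel stands in for the paper's specialized string kernel, so this shows only the regression machinery, not the biological kernel itself.

```python
import numpy as np

def rbf(X, Z, gamma=1.0):
    """Gaussian RBF kernel matrix between row-wise point sets X and Z."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def krr_fit_predict(K_train, y, K_test_train, lam=1e-2):
    """Kernel ridge regression in closed form: alpha = (K + lam*I)^-1 y,
    prediction = K(test, train) @ alpha."""
    alpha = np.linalg.solve(K_train + lam * np.eye(len(y)), y)
    return K_test_train @ alpha
```

Any positive semi-definite kernel (including a string kernel over peptides) can be dropped in by replacing `rbf`; only the Gram matrices change, not the solver.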
to verify the proposed networks, this paper conducted two experiments; one is a classification task on cifar-10/100 and the other is an on-road object detection task using a single-shot multi-box detector (ssd) on the kitti dataset.",4 "augmented artificial intelligence: a conceptual framework. all artificial intelligence (ai) systems make errors. these errors are unexpected, and differ often from the typical human mistakes (""non-human"" errors). the ai errors should be corrected without damage to the existing skills and, hopefully, avoiding direct human expertise. this paper presents an initial summary report of a project taking a new and systematic approach to improving the intellectual effectiveness of individual ais and communities of ais. we combine ideas of learning in heterogeneous multiagent systems with new and original mathematical approaches for non-iterative corrections of errors of legacy ai systems. new stochastic separation theorems demonstrate that the corrector technology can be used to handle errors in data flows with general probability distributions and far away from the classical i.i.d. hypothesis. in particular, the analysis of the mathematical foundations of ai non-destructive correction answers one general problem published by donoho and tanner in 2009.",4 "automatic photo adjustment using deep neural networks. photo retouching enables photographers to invoke dramatic visual impressions by artistically enhancing their photos through stylistic color and tone adjustments. however, it is also a time-consuming and challenging task that requires advanced skills beyond the abilities of casual photographers. using an automated algorithm is an appealing alternative to manual work, but such an algorithm faces many hurdles. many photographic styles rely on subtle adjustments that depend on the image content and even its semantics. further, these adjustments are often spatially varying. because of these characteristics, existing automatic algorithms are still limited and cover only a subset of these challenges. recently, deep machine learning has shown unique abilities to address hard problems that resisted machine algorithms for long. this motivated us to explore the use of deep learning in the context of photo editing. in this paper, we explain how to formulate the automatic photo adjustment problem in a way suitable for this approach.
we also introduce an image descriptor that accounts for the local semantics of an image. our experiments demonstrate that our deep learning formulation applied using these descriptors successfully captures sophisticated photographic styles. in particular, and unlike previous techniques, it can model local adjustments that depend on image semantics. we show on several examples that this yields results that are qualitatively and quantitatively better than previous work.",4 "process-oriented iterative multiple alignment for medical process mining. adapted from biological sequence alignment, trace alignment is a process mining technique used to visualize and analyze workflow data. any analysis done with this method, however, is affected by the alignment quality. the best existing trace alignment techniques use progressive guide-trees to heuristically approximate the optimal alignment in o(n^2 l^2) time. these algorithms are heavily dependent on the selected guide-tree metric, often return sum-of-pairs-score-reducing errors that interfere with interpretation, and are computationally intensive for large datasets. to alleviate these issues, we propose process-oriented iterative multiple alignment (pima), which contains specialized optimizations to better handle workflow data. we demonstrate that pima is a flexible framework capable of achieving better sum-of-pairs scores than existing trace alignment algorithms in only o(n l^2) time. we applied pima to analyzing medical workflow data, showing how iterative alignment can better represent the data and facilitate the extraction of insights from data visualization.",4 "multi-domain neural network language generation for spoken dialogue systems. moving from limited-domain natural language generation (nlg) to open domain is difficult because the number of semantic input combinations grows exponentially with the number of domains. therefore, it is important to leverage existing resources and exploit similarities between domains to facilitate domain adaptation. in this paper, we propose a procedure to train multi-domain, recurrent neural network-based (rnn) language generators via multiple adaptation steps. in this procedure, a model is first trained on counterfeited data synthesised from an out-of-domain dataset, and then fine tuned on a small set of in-domain utterances with a discriminative objective function. 
corpus-based evaluation results show that the proposed procedure can achieve competitive performance in terms of bleu score and slot error rate while significantly reducing the data needed to train generators in new, unseen domains. in subjective testing, human judges confirm that the procedure greatly improves generator performance when only a small amount of data is available in the domain.",4 "handwritten digit recognition with bio-inspired hierarchical networks. the human brain processes information showing learning and prediction abilities, but the underlying neuronal mechanisms still remain unknown. recently, many studies have proved that neuronal networks are able to perform generalizations and associations of sensory inputs. in this paper, following a set of neurophysiological evidences, we propose a learning framework with strong biological plausibility that mimics prominent functions of cortical circuitries. we developed the inductive conceptual network (icn), a hierarchical bio-inspired network able to learn invariant patterns by means of variable-order markov models implemented in its nodes. the outputs of the top-most node of the icn hierarchy, representing the highest input generalization, allow for the automatic classification of inputs. we found that the icn clusterized mnist images with an error of 5.73% and usps images with an error of 12.56%.",4 "differential methods in catadioptric sensor design with applications to panoramic imaging. we discuss design techniques for catadioptric sensors that realize given projections. in general, these problems do not have solutions, but approximate solutions may often be found that are visually acceptable. there are several methods to approach this problem, but here we focus on what we call the ``vector field approach''. as an application, a true panoramic mirror is derived, i.e. a mirror that yields a cylindrical projection to the viewer without any digital unwarping.",4 "a survey of the naïve bayes machine learning approach for text document classification. text document classification aims at associating one or more predefined categories with a document, based on the likelihood suggested by a training set of labeled documents. many machine learning algorithms play a vital role in training the system with predefined categories, and among them naïve bayes has some intriguing facts: it is simple, easy to implement and draws better accuracy on large datasets in spite of its naïve dependence assumption. 
the importance of the naïve bayes machine learning approach has thus been felt, and hence this study was taken up on text document classification and the statistical event models available. in this survey, various feature selection methods are discussed and compared, along with the metrics related to text document classification.",4 "semi-blind sparse image reconstruction with application to mrfm. we propose a solution to the image deconvolution problem where the convolution kernel or point spread function (psf) is assumed to be only partially known. small perturbations generated from the model are exploited to produce a few principal components explaining the psf uncertainty in a high dimensional space. unlike recent developments on blind deconvolution of natural images, we assume the image is sparse in the pixel basis, a natural sparsity arising in magnetic resonance force microscopy (mrfm). our approach adopts a bayesian metropolis-within-gibbs sampling framework. the performance of our bayesian semi-blind algorithm for sparse images is superior to previously proposed semi-blind algorithms such as the alternating minimization (am) algorithm and to blind algorithms developed for natural images. we illustrate our myopic algorithm on real mrfm tobacco virus data.",15 "inferring taxi status using gps trajectories. in this paper, we infer the statuses of a taxi, consisting of occupied, non-occupied and parked, in terms of its gps trajectory. this status information can enable urban computing for improving a city's transportation systems and land use planning. in our solution, we first identify and extract a set of effective features incorporating the knowledge of a single trajectory, historical trajectories and geographic data like the road network. second, a parking status detection algorithm is devised to find parking places (from a given trajectory), dividing a trajectory into segments (i.e., sub-trajectories). third, we propose a two-phase inference model to learn the status (occupied or non-occupied) of each point from a taxi segment. this model first uses the identified features to train a local probabilistic classifier and then carries out a hidden semi-markov model (hsmm) for globally considering long term travel patterns. 
we evaluated our method with a large-scale real-world trajectory dataset generated by 600 taxis, showing the advantages of our method over baselines.",4 "camera identification: grouping images from a database, based on shared noise patterns. previous research showed that camera specific noise patterns, so-called prnu-patterns, can be extracted from images so that related images can be found. this particular research focuses on grouping images from a database, based on shared noise patterns, as an identification method for cameras. using the method described in this article, groups of images created using the same camera can be linked within a large database of images. using matlab programming, the relevant image noise patterns are extracted from the images much quicker than with common methods, because we use faster noise extraction filters and improvements that reduce calculation costs. by relating noise patterns, through correlation with a certain threshold value, they can be quickly matched. hereby, within a database of images, groups of related images can be linked, and the method can be used to scan a large number of images for suspect noise patterns.",4 "aba+: assumption-based argumentation with preferences. we present aba+, a new approach to handling preferences in a well known structured argumentation formalism, assumption-based argumentation (aba). in aba+, preference information given over assumptions is incorporated directly into the attack relation, thus resulting in attack reversal. aba+ conservatively extends aba and exhibits various desirable features regarding the relationship among argumentation semantics as well as preference handling. we also introduce weak contraposition, a principle concerning reasoning with rules and preferences that relaxes the standard principle of contraposition, while guaranteeing additional desirable features for aba+.",4 "identifying unknown unknowns in the open world: representations and policies for guided exploration. predictive models deployed in the real world may assign incorrect labels to instances with high confidence. such errors or unknown unknowns are rooted in model incompleteness, and typically arise from a mismatch between the training data and the cases encountered at test time. as the models are blind to such errors, input from an oracle is needed to identify these failures. 
in this paper, we formulate and address the problem of informed discovery of unknown unknowns of any given predictive model, where unknown unknowns occur due to systematic biases in the training data. we propose a model-agnostic methodology which uses feedback from an oracle to both identify unknown unknowns and to intelligently guide the discovery. we employ a two-phase approach which first organizes the data into multiple partitions based on the feature similarity of instances and the confidence scores assigned by the predictive model, and then utilizes an explore-exploit strategy for discovering unknown unknowns across these partitions. we demonstrate the efficacy of our framework by varying the underlying causes of unknown unknowns across various applications. to the best of our knowledge, this paper presents the first algorithmic approach to the problem of discovering unknown unknowns of predictive models.",4 "on the difficulty of selecting ising models with approximate recovery. in this paper, we consider the problem of estimating the underlying graph associated with an ising model given a number of independent and identically distributed samples. we adopt an \emph{approximate recovery} criterion that allows for a number of missed edges or incorrectly-included edges, in contrast with the widely-studied exact recovery problem. our main results provide information-theoretic lower bounds on the sample complexity for graph classes imposing constraints on the number of edges, maximal degree, and other properties. we identify a broad range of scenarios where, up to either constant factors or logarithmic factors, our lower bounds match the best known lower bounds for the exact recovery criterion, several of which are known to be tight or near-tight. hence, in these cases, approximate recovery has a similar difficulty to exact recovery in the minimax sense. our bounds are obtained via a modification of fano's inequality for handling the approximate recovery criterion, along with suitably-designed ensembles of graphs that can broadly be classed into two categories: (i) those containing graphs that contain several isolated edges or cliques and are thus difficult to distinguish from the empty graph; (ii) those containing graphs in which certain groups of nodes are highly correlated, thus making it difficult to determine precisely which edges connect them. we support our theoretical results on these ensembles with numerical experiments.",4 "creative robot dance with a variational encoder. 
what we appreciate in dance is the ability of people to spontaneously improvise new movements and choreographies, surrendering to the music rhythm, inspired by current perceptions and sensations and by previous experiences, deeply stored in memory. like other human abilities, this is, of course, challenging to reproduce in an artificial entity such as a robot. recent generations of anthropomorphic robots, the so-called humanoids, however, exhibit more and more sophisticated skills and have raised the interest of robotic communities in designing and experimenting with systems devoted to automatic dance generation. in this work, we highlight the importance of modeling computational creativity in the behavior of dancing robots, to avoid the mere execution of preprogrammed dances. in particular, we exploit a deep learning approach that allows a robot to generate in real time new dancing movements according to the listened music.",4 "translation of ""zur ermittlung eines objektes aus zwei perspektiven mit innerer orientierung"" by erwin kruppa (1913). erwin kruppa's 1913 paper, ""zur ermittlung eines objektes aus zwei perspektiven mit innerer orientierung"", sitzungsberichte der mathematisch-naturwissenschaftlichen kaiserlichen akademie der wissenschaften, vol. 122 (1913), pp. 1939-1948, which may be translated as ""to determine a 3d object from two perspective views with known inner orientation"", is a landmark paper in computer vision that provides the first five-point algorithm for relative pose estimation. kruppa showed that (a finite number of solutions for) the relative pose between two calibrated images of a rigid object can be computed from five point matches between the images. kruppa's work also gained attention in the topic of camera self-calibration, as presented in (maybank and faugeras, 1992). since the paper is still relevant today (more than one hundred citations within the last ten years) and the paper is not available online, we ordered a copy from the german national library in frankfurt and provide an english translation along with the german original. we also adapt the terminology to modern jargon and provide some clarifications (highlighted in sans-serif font). for a historical review of geometric computer vision, the reader is referred to the recent survey paper (sturm, 2011).",4 "a neural autoregressive approach to collaborative filtering. 
this paper proposes cf-nade, a neural autoregressive architecture for collaborative filtering (cf) tasks, which is inspired by the restricted boltzmann machine (rbm) based cf model and the neural autoregressive distribution estimator (nade). we first describe the basic cf-nade model for cf tasks. then we propose to improve the model by sharing parameters between different ratings. a factored version of cf-nade is also proposed for better scalability. furthermore, we take the ordinal nature of the preferences into consideration and propose an ordinal cost to optimize cf-nade, which shows superior performance. finally, cf-nade can be extended to a deep model, with only moderately increased computational complexity. experimental results show that cf-nade with a single hidden layer beats all previous state-of-the-art methods on the movielens 1m, movielens 10m, and netflix datasets, and adding more hidden layers can further improve the performance.",4 "depth-width tradeoffs in approximating natural functions with neural networks. we provide several new depth-based separation results for feed-forward neural networks, proving that various types of simple and natural functions can be better approximated using deeper networks than shallower ones, even if the shallower networks are much larger. this includes indicators of balls and ellipses; non-linear functions which are radial with respect to the $l_1$ norm; and smooth non-linear functions. we also show that these gaps can be observed experimentally: increasing the depth indeed allows better learning than increasing the width, when training neural networks to learn an indicator of a unit ball.",4 "skeleton-based action recognition with convolutional neural networks. current state-of-the-art approaches to skeleton-based action recognition are mostly based on recurrent neural networks (rnn). in this paper, we propose a novel convolutional neural networks (cnn) based framework for both action classification and detection. raw skeleton coordinates as well as skeleton motion are fed directly into the cnn for label prediction. a novel skeleton transformer module is designed to rearrange and select important skeleton joints automatically. with a simple 7-layer network, we obtain 89.3% accuracy on the validation set of the ntu rgb+d dataset. 
for action detection in untrimmed videos, we develop a window proposal network to extract temporal segment proposals, which are further classified within the same network. on the recent pku-mmd dataset, we achieve 93.7% map, surpassing the baseline by a large margin.",4 "end-to-end audiovisual speech recognition. several end-to-end deep learning approaches have been recently presented which extract either audio or visual features from the input images or audio signals and perform speech recognition. however, research on end-to-end audiovisual models is very limited. in this work, we present an end-to-end audiovisual model based on residual networks and bidirectional gated recurrent units (bgrus). to the best of our knowledge, this is the first audiovisual fusion model which simultaneously learns to extract features directly from the image pixels and audio waveforms and performs within-context word recognition on a large publicly available dataset (lrw). the model consists of two streams, one for each modality, which extract features directly from mouth regions and raw waveforms. the temporal dynamics in each stream/modality are modeled by a 2-layer bgru, and the fusion of the multiple streams/modalities takes place via another 2-layer bgru. a slight improvement in the classification rate over an end-to-end audio-only and an mfcc-based model is reported in clean audio conditions and at low levels of noise. in the presence of high levels of noise, the end-to-end audiovisual model significantly outperforms both audio-only models.",4 "maximum principle based algorithms for deep learning. the continuous dynamical system approach to deep learning is explored in order to devise alternative frameworks for training algorithms. training is recast as a control problem, and this allows us to formulate necessary optimality conditions in continuous time using the pontryagin's maximum principle (pmp). a modification of the method of successive approximations is then used to solve the pmp, giving rise to an alternative training algorithm for deep learning. this approach has the advantage that rigorous error estimates and convergence results can be established. we also show that it may avoid some pitfalls of gradient-based methods, such as slow convergence on flat landscapes near saddle points. 
furthermore, we demonstrate that it obtains a favorable initial convergence rate per-iteration, provided that the hamiltonian maximization can be efficiently carried out - a step which is still in need of improvement. overall, the approach opens up new avenues to attack problems associated with deep learning, such as trapping in slow manifolds and the inapplicability of gradient-based methods for discrete trainable variables.",4 "kinship verification from videos using spatio-temporal texture features and deep learning. automatic kinship verification using facial images is a relatively new and challenging research problem in computer vision. it consists in automatically predicting whether two persons have a biological kin relation by examining their facial attributes. while most existing works extract shallow handcrafted features from still face images, we approach this problem from a spatio-temporal point of view and explore the use of both shallow texture features and deep features for characterizing faces. promising results, especially those of deep features, are obtained on the benchmark uva-nemo smile database. our extensive experiments also show the superiority of using videos over still images, hence pointing out the important role of facial dynamics in kinship verification. furthermore, the fusion of the two types of features (i.e. shallow spatio-temporal texture features and deep features) shows significant performance improvements compared to state-of-the-art methods.",4 "prediction of sea surface temperature using long short-term memory. this letter adopts long short-term memory (lstm) to predict sea surface temperature (sst), which is the first attempt, to our knowledge, to use a recurrent neural network to solve the problem of sst prediction, making one week and one month daily predictions. we formulate the sst prediction problem as a time series regression problem. lstm is a special kind of recurrent neural network, which introduces a gate mechanism into the vanilla rnn to prevent the vanished or exploding gradient problem. it has a strong ability to model the temporal relationship of time series data and can handle the long-term dependency problem well. the proposed network architecture is composed of two kinds of layers: an lstm layer and a fully-connected dense layer. the lstm layer is utilized to model the time series relationship. 
the fully-connected layer is utilized to map the output of the lstm layer to the final prediction. we explore the optimal setting of this architecture by experiments and report the accuracy for the coastal seas of china to confirm the effectiveness of the proposed method. in addition, we also show its online updated characteristics.",4 "recommending with an agenda: active learning of private attributes using matrix factorization. recommender systems leverage user demographic information, such as age, gender, etc., to personalize recommendations and better place their targeted ads. oftentimes, users do not volunteer this information due to privacy concerns, or due to a lack of initiative in filling out their online profiles. we illustrate a new threat in which a recommender learns private attributes of users who do not voluntarily disclose them. we design both passive and active attacks that solicit ratings for strategically selected items, and could thus be used by a recommender system to pursue this hidden agenda. our methods are based on a novel usage of bayesian matrix factorization in an active learning setting. evaluations on multiple datasets illustrate that such attacks are indeed feasible and use significantly fewer rated items than static inference methods. importantly, they succeed without sacrificing the quality of recommendations to users.",4 "discriminative neural sentence modeling by tree-based convolution. this paper proposes a tree-based convolutional neural network (tbcnn) for discriminative sentence modeling. our models leverage either constituency trees or dependency trees of sentences. the tree-based convolution process extracts sentences' structural features, and these features are aggregated by max pooling. such architecture allows short propagation paths between the output layer and the underlying feature detectors, which enables effective structural feature learning and extraction. we evaluate our models on two tasks: sentiment analysis and question classification. in both experiments, tbcnn outperforms previous state-of-the-art results, including existing neural networks and dedicated feature/rule engineering. we also make efforts to visualize the tree-based convolution process, shedding light on how our models work.",4 "revealing the autonomous system taxonomy: the machine learning approach. 
although the internet as-level topology has been extensively studied over the past few years, little is known about the details of the as taxonomy. an as ""node"" can represent a wide variety of organizations, e.g., a large isp, a small private business, or a university, with vastly different network characteristics, external connectivity patterns, network growth tendencies, and other properties that we can hardly neglect while working on veracious internet representations in simulation environments. in this paper, we introduce a radically new approach based on machine learning techniques to map all the ases in the internet into a natural as taxonomy. we successfully classify 95.3% of ases with an expected accuracy of 78.1%. we release to the community the as-level topology dataset augmented with: 1) the as taxonomy information and 2) the set of as attributes we used to classify ases. we believe that this dataset will serve as an invaluable addition to further understanding of the structure and evolution of the internet.",4 "a general framework for interacting bayes-optimally with self-interested agents using arbitrary parametric model and model prior. recent advances in bayesian reinforcement learning (brl) have shown that bayes-optimality is theoretically achievable by modeling the environment's latent dynamics using a flat-dirichlet-multinomial (fdm) prior. in self-interested multi-agent environments, the transition dynamics are mainly controlled by the other agent's stochastic behavior, for which fdm's independence and modeling assumptions do not hold. as a result, fdm does not allow the other agent's behavior to be generalized across different states nor specified using prior domain knowledge. to overcome these practical limitations of fdm, we propose a generalization of brl to integrate a general class of parametric models and model priors, thus allowing practitioners' domain knowledge to be exploited to produce a fine-grained and compact representation of the other agent's behavior. empirical evaluation shows that our approach outperforms existing multi-agent reinforcement learning algorithms.",4 "a new hybrid metric for verifying parallel corpora of arabic-english. this paper discusses a new metric that has been applied to verify the quality of translation between sentence pairs in parallel corpora of arabic-english. this metric combines two techniques, one based on sentence length and the other based on compression code length. 
experiments on sample tests of parallel arabic-english corpora indicate that the combination of the two techniques improves the accuracy of identification of satisfactory and unsatisfactory sentence pairs compared to using sentence length or compression code length alone. the new method proposed in this research is effective at filtering noise and reducing mis-translations, resulting in greatly improved quality.",4 "asp for minimal entailment in a rational extension of sroel. in this paper we exploit answer set programming (asp) for reasoning in a rational extension sroel-r-t of the low complexity description logic sroel, which underlies the owl el ontology language. in the extended language, a typicality operator t is allowed to define concepts t(c) (typical c's) under a rational semantics. it has been proven that instance checking under rational entailment has polynomial complexity. to strengthen rational entailment, in this paper we consider a minimal model semantics. we show that, for arbitrary sroel-r-t knowledge bases, instance checking under minimal entailment is \pi^p_2-complete. relying on a small model result, where models correspond to answer sets of a suitable asp encoding, we exploit answer set preferences (and, in particular, the asprin framework) for reasoning under minimal entailment. the paper is under consideration for acceptance in theory and practice of logic programming.",4 "using deep learning to reveal the neural code for images in primary visual cortex. primary visual cortex (v1) is the first stage of cortical image processing, and a major effort in systems neuroscience is devoted to understanding how it encodes information about visual stimuli. within v1, many neurons respond selectively to edges of a given preferred orientation: these are known as simple or complex cells, and they are well-studied. other neurons respond to localized center-surround image features. still others respond selectively to certain image stimuli, but the specific features that excite them are unknown. moreover, even for the simple and complex cells-- the best-understood v1 neurons-- it is challenging to predict how they will respond to natural image stimuli. thus, there are important gaps in our understanding of how v1 encodes images. to fill this gap, we train deep convolutional neural networks to predict the firing rates of v1 neurons in response to natural image stimuli, and we find that 15% of these neurons are within 10% of their theoretical limit of predictability. 
for these well predicted neurons, we invert the predictor network to identify the image features (receptive fields) that cause the v1 neurons to spike. in addition to those with previously-characterized receptive fields (gabor wavelet and center-surround), we identify neurons that respond predictably to higher-level textural image features that are not localized to any particular region of the image.",16 "negative results in computer vision: a perspective. a negative result is when the outcome of an experiment or a model is not what is expected, or when a hypothesis does not hold. despite being often overlooked in the scientific community, negative results are results, and they carry value. while this topic has been extensively discussed in other fields such as social sciences and biosciences, less attention has been paid to it in the computer vision community. the unique characteristics of computer vision, particularly its experimental aspect, call for a special treatment of this matter. in this paper, we address what makes negative results important, how they should be disseminated and incentivized, and what lessons can be learned from cognitive vision research in this regard. further, we discuss issues such as computer vision and human vision interaction, experimental design and statistical hypothesis testing, explanatory versus predictive modeling, performance evaluation, model comparison, as well as computer vision research culture.",4 "group event detection with a varying number of group members for video surveillance. this paper presents a novel approach for the automatic recognition of group activities for video surveillance applications. we propose to use a group representative to handle the recognition with a varying number of group members, and use an asynchronous hidden markov model (ahmm) to model the relationship between people. furthermore, we propose a group activity detection algorithm which can handle both symmetric and asymmetric group activities, and demonstrate that this approach enables the detection of hierarchical interactions between people. experimental results show the effectiveness of our approach.",4 "retinal vasculature segmentation using local saliency maps and generative adversarial networks for image super resolution. we propose an image super resolution (isr) method using generative adversarial networks (gans) which takes a low resolution input fundus image and generates a high resolution super resolved (sr) image up to a scaling factor of $16$. 
this facilitates more accurate automated image analysis, especially for small or blurred landmarks and pathologies. local saliency maps, which define each pixel's importance, are used to define a novel saliency loss in the gan cost function. experimental results show that the resulting sr images have perceptual quality very close to the original images and perform better than competing methods that do not weigh the pixels according to their importance. when used for retinal vasculature segmentation, the sr images result in accuracy levels close to those obtained when using the original images.",4 "approximation of the two-part mdl code. approximation of the optimal two-part mdl code for given data, by successive monotonically length-decreasing two-part mdl codes, has the following properties: (i) the computation of each step may take arbitrarily long; (ii) we may not know when we reach the optimum, or whether we will reach the optimum at all; (iii) the sequence of models generated may not monotonically improve the goodness of fit; but (iv) the model associated with the optimum has (almost) the best goodness of fit. to express the practically interesting goodness of fit of individual models for individual data sets we have to rely on kolmogorov complexity.",4 "deductive and analogical reasoning on a semantically embedded knowledge graph. representing knowledge as high-dimensional vectors in a continuous semantic vector space can help overcome the brittleness and incompleteness of traditional knowledge bases. we present a method for performing deductive reasoning directly in such a vector space, combining analogy, association, and deduction in a straightforward way at each step in a chain of reasoning, drawing on knowledge from diverse sources and ontologies.",4 "a framework for compiling preferences in logic programs. we introduce a methodology and framework for expressing general preference information in logic programming under the answer set semantics. an ordered logic program is an extended logic program in which rules are named by unique terms, and in which preferences among rules are given by a set of atoms of the form s < t, where s and t are names. an ordered logic program is transformed into a second, regular, extended logic program wherein the preferences are respected, in that the answer sets obtained from the transformed program correspond to the preferred answer sets of the original program. our approach allows the specification of dynamic orderings, in which preferences can appear arbitrarily within a program. 
static orderings (in which preferences are external to the logic program) are a trivial restriction of the general dynamic case. first, we develop a specific approach to reasoning with preferences, wherein the preference ordering specifies the order in which rules are to be applied. we then demonstrate the wide range of applicability of our framework by showing how other approaches, among them that of brewka and eiter, can be captured within our framework. since the result of each of these transformations is an extended logic program, we can make use of existing implementations, such as dlv and smodels. to this end, we have developed a publicly available compiler as a front-end for these programming systems.",4 "a generative model for volume rendering. we present a technique to synthesize and analyze volume-rendered images using generative models. we use the generative adversarial network (gan) framework to compute a model from a large collection of volume renderings, conditioned on (1) viewpoint and (2) transfer functions for opacity and color. our approach facilitates tasks for volume analysis that are challenging to achieve using existing rendering techniques such as ray casting or texture-based methods. we show how to guide the user in transfer function editing by quantifying the expected change in the output image. additionally, the generative model transforms transfer functions into a view-invariant latent space specifically designed to synthesize volume-rendered images. we use this space directly for rendering, enabling the user to explore the space of volume-rendered images. as our model is independent of the choice of volume rendering process, we show how to analyze volume-rendered images produced by direct and global illumination lighting, for a variety of volume datasets.",4 "speech enhancement using a pitch detection approach for a noisy environment. acoustical mismatch between the training and testing phases degrades speech recognition results outstandingly. this problem has limited the development of real-world nonspecific applications, as testing conditions are highly variant or even unpredictable during the training process. therefore the background noise has to be removed from the noisy speech signal to increase the signal intelligibility and to reduce listener fatigue. enhancement techniques, applied as pre-processing stages to such systems, remarkably improve recognition results. 
in this paper, a novel approach is used to enhance the perceived quality of the speech signal when the additive noise cannot be directly controlled. instead of controlling the background noise, we propose to reinforce the speech signal so that it can be heard more clearly in noisy environments. the subjective evaluation shows that the proposed method improves the perceptual quality of speech in various noisy environments. in some cases speaking may be more convenient than typing, even for rapid typists: many mathematical symbols are missing from the keyboard but can be easily spoken and recognized. therefore, the proposed system can be used in an application designed for mathematical symbol recognition (especially for symbols not available on the keyboard) in schools.",4 "dynamic nonlocal language modeling via hierarchical topic-based adaptation. this paper presents a novel method of generating and applying hierarchical, dynamic topic-based language models. it proposes and evaluates new cluster generation, hierarchical smoothing and adaptive topic-probability estimation techniques. these combined models help capture long-distance lexical dependencies. experiments on the broadcast news corpus show significant improvement in perplexity (10.5% overall and 33.5% on the target vocabulary).",4 "specifying non-markovian rewards in mdps using ldl on finite traces (preliminary version). in markov decision processes (mdps), the reward obtained in a state depends on the properties of the last state and action. this state dependency makes it difficult to reward more interesting long-term behaviors, such as always closing a door after it has been opened, or providing coffee only following a request. extending mdps to handle non-markovian reward functions was the subject of two previous lines of work, both using variants of ltl to specify the reward function and then compiling the new model back into a markovian model. building upon recent progress in the theories of temporal logics over finite traces, we adopt ldlf for specifying non-markovian rewards and provide an elegant automata construction for building a markovian model, which extends that of previous work and offers strong minimality and compositionality guarantees.",4 "a theory of unified relativity and the biovielectroluminescence phenomenon via the fly's visual imaging system. 
in an elucidation upon the fly's neuronal patterns as a link to computer graphics and memory card i/o's, we investigated a phenomenon propounding a unified theory of einstein's two known relativities. it is conclusive that flies could contribute a certain amount of neuromatrices indicating the imagery function of their visual-computational system for computer graphics and storage systems. this visual system involves a time aspect, whereas flies possess faster pulses compared to humans' visual ability due to the e-field state active on the fly's eye surface. this behaviour is tested on the ommatidia of a dissected fly specimen. electro-optical contacts as electrodes are wired to the flesh, forming an organic emitter layer to stimulate light emission, and thereby a computer circuit. the next step is applying a threshold voltage and secondary voltages to the circuit, denoting an array of essential electrodes as a bit switch. as a result, the circuit's dormant pulses versus active pulses in the specimen's area are recorded. the outcome matrix possesses a construction of rgb and time radicals expressing the time problem of consumption, allocating time to computational algorithms, enhancing technology far beyond. the obtained formulation generates the consumed distance cons(x), denoting circuital travel of data between source/sink and pixel data in bendable wavelengths. an 'image logic' is in place, incorporating point graphical acceleration, which permits one to enhance graphics and immensely optimize central processing, data transmissions and memory in the computer visual system. this phenomenon could mainly be used in 360-deg. display/viewing, 3d scanning techniques, military and medicine, and as a robust and cheap substitution for e.g. pre-motion pattern analysis and real-time rendering on lcds.",4 "highly automated learning for improved active safety of vulnerable road users. highly automated driving requires precise models of traffic participants. many state of the art models are currently based on machine learning techniques. among others, the required amount of labeled data is one major challenge. an autonomous learning process addressing this problem is proposed. 
initial models iteratively refined three steps: (1) detection context identification, (2) novelty detection active learning (3) online model adaption.",4 "novel strategy selection method multi-objective clustering algorithms using game theory. important factors contribute efficiency game-theoretical algorithms time game complexity. study, offered elegant method deal high complexity game theoretic multi-objective clustering methods large-sized data sets. here, developed method selects subset strategies strategies profile player. case, size payoff matrices reduces significantly remarkable impact time complexity. therefore, practical problems data tractable less computational complexity. although strategies set may grow increasing number data points, presented model strategy selection reduces strategy space, considerably, clusters subdivided several sub-clusters local game. remarkable results demonstrate efficiency presented approach reducing computational complexity problem concern.",4 "group theory, group actions, evolutionary algorithms, global optimization. paper use group, action orbit understand evolutionary solve nonconvex optimization problems.",4 "weaving multi-scale context single shot detector. aggregating context information multiple scales proved effective improving accuracy single shot detectors (ssds) object detection. however, existing multi-scale context fusion techniques computationally expensive, unfavorably diminishes advantageous speed ssd. work, propose novel network topology, called weavenet, efficiently fuse multi-scale information boost detection accuracy negligible extra cost. proposed weavenet iteratively weaves context information adjacent scales together enable sophisticated context reasoning maintaining fast speed. built stacking light-weight blocks, weavenet easy train without requiring batch normalization accelerated proposed architecture simplification. 
experimental results pascal voc 2007, pascal voc 2012 benchmarks show significant performance boost brought weavenet. 320x320 input batch size = 8, weavenet reaches 79.5% map pascal voc 2007 test 101 fps 4 fps extra cost, improves 79.7% map iterations.",4 "trax: visual tracking exchange protocol library. paper address problem developing on-line visual tracking algorithms. present specialized communication protocol serves bridge tracker implementation utilizing application. decouples development algorithms application, encouraging re-usability. primary use case algorithm evaluation protocol facilitates complex evaluation scenarios used nowadays thus pushing forward field visual tracking. present reference implementation protocol makes easy use several popular programming languages discuss protocol already used usage scenarios envision future.",4 "visual representation wittgenstein's tractatus logico-philosophicus. paper present data visualization method together potential usefulness digital humanities philosophy language. compile multilingual parallel corpus different versions wittgenstein's tractatus logico-philosophicus, including original german translations english, spanish, french, russian. using corpus, compute similarity measure propositions render visual network relations different languages.",4 "leveraging large amounts weakly supervised data multi-language sentiment classification. paper presents novel approach multi-lingual sentiment classification short texts. challenging task amount training data languages english limited. previously proposed multi-lingual approaches typically require establish correspondence english powerful classifiers already available. contrast, method require supervision. leverage large amounts weakly-supervised data various languages train multi-layer convolutional network demonstrate importance using pre-training networks.
thoroughly evaluate approach various multi-lingual datasets, including recent semeval-2016 sentiment prediction benchmark (task 4), achieved state-of-the-art performance. also compare performance model trained individually language variant trained languages once. show latter model reaches slightly worse - still acceptable - performance compared single language model, benefiting better generalization properties across languages.",4 "learning maps: visual common sense autonomous driving. today's autonomous vehicles rely extensively high-definition 3d maps navigate environment. approach works well maps completely up-to-date, safe autonomous vehicles must able corroborate map's information via real time sensor-based system. goal work develop model road layout inference given imagery on-board cameras, without reliance high-definition maps. however, sufficient dataset training model exists. here, leverage availability standard navigation maps corresponding street view images construct automatically labeled, large-scale dataset complex scene understanding problem. matching road vectors metadata navigation maps google street view images, assign ground truth road layout attributes (e.g., distance intersection, one-way vs. two-way street) images. train deep convolutional networks predict road layout attributes given single monocular rgb image. experimental evaluation demonstrates model learns correctly infer road attributes using panoramas captured car-mounted cameras input. additionally, results indicate method may suitable novel application recommending safety improvements infrastructure (e.g., suggesting alternative speed limit street).",4 "expectation-propagation likelihood-free inference. many models interest natural social sciences closed-form likelihood function, means cannot treated using usual techniques statistical inference. case models efficiently simulated, bayesian inference still possible thanks approximate bayesian computation (abc) algorithm. 
although many refinements suggested, abc inference still far routine. abc often excruciatingly slow due low acceptance rates. addition, abc requires introducing vector ""summary statistics"", choice relatively arbitrary, often require trial error, making whole process quite laborious user. introduce work ep-abc algorithm, adaptation likelihood-free context variational approximation algorithm known expectation propagation (minka, 2001). main advantage ep-abc faster orders magnitude standard algorithms, producing overall approximation error typically negligible. second advantage ep-abc replaces usual global abc constraint vector summary statistics computed whole dataset, n local constraints form apply separately data-point. consequence, often possible away summary statistics entirely. case, ep-abc approximates directly evidence (marginal likelihood) model. comparisons performed three real-world applications typical likelihood-free inference, including one application neuroscience novel, possibly challenging standard abc techniques.",19 "diffusion-convolutional neural networks. present diffusion-convolutional neural networks (dcnns), new model graph-structured data. introduction diffusion-convolution operation, show diffusion-based representations learned graph-structured data used effective basis node classification. dcnns several attractive qualities, including latent representation graphical data invariant isomorphism, well polynomial-time prediction learning represented tensor operations efficiently implemented gpu. several experiments real structured datasets, demonstrate dcnns able outperform probabilistic relational models kernel-on-graph methods relational node classification tasks.",4 "phase transition sonfis&sorst. study, introduce general frame many connected intelligent particles systems (macips). 
connections interconnections particles get complex behavior merely simple system (system system). contribution natural computing, information granulation theory, main topics spacious skeleton. upon clue, organize two algorithms involved prominent intelligent computing approximate reasoning methods: self organizing feature map (som), neuro-fuzzy inference system rough set theory (rst). this, show algorithms taken linkage government-society interaction, government catches various fashions behavior: solid (absolute) flexible. so, transition society, changing connectivity parameters (noise) order disorder inferred. add this, one may find indirect mapping among financial systems eventual market fluctuations macips. keywords: phase transition, sonfis, sorst, many connected intelligent particles system, society-government interaction",4 "depth monocular images using semi-parallel deep neural network (spdnn) hybrid architecture. convolutional neural network (cnn) techniques applied problem determining depth single camera image (monocular depth). fully connected cnn topologies preserve details input images, enabling detection fine details, miss larger features; networks employ 2x2, 4x4 8x8 max-pooling operators determine larger features expense finer details. designing, training optimising set topologies, networks may combined single network topology using graph optimization techniques. ""semi parallel deep neural network (spdnn)"" eliminates duplicate common network layers, reducing network size computational effort significantly, optimized retraining achieve improved level convergence individual topologies. study, four models trained evaluated 2 stages kitti dataset. ground truth images first part experiment come benchmark, second part, ground truth images depth map results applying state-of-the-art stereo matching method. results evaluation demonstrate using post-processing techniques refine target network increases accuracy depth estimation individual mono images.
second evaluation shows using segmentation data input improve depth estimation results point performance comparable stereo depth estimation. computational time also discussed study.",4 "gaussian process regression student-t likelihood. paper considers robust efficient implementation gaussian process regression student-t observation model. challenge student-t model analytically intractable inference several approximative methods proposed. expectation propagation (ep) found accurate method many empirical studies convergence ep known problematic models containing non-log-concave site functions student-t distribution. paper illustrate situations standard ep fails converge review different modifications alternative algorithms improving convergence. demonstrate convergence problems may occur type-ii maximum posteriori (map) estimation hyperparameters show standard ep may converge map values difficult cases. present robust implementation relies primarily parallel ep updates utilizes moment-matching-based double-loop algorithm adaptively selected step size difficult cases. predictive performance ep compared laplace, variational bayes, markov chain monte carlo approximations.",19 "hyper-heuristics achieve optimal performance pseudo-boolean optimisation. selection hyper-heuristics randomised search methodologies choose execute heuristics set low-level heuristics. recent research leadingones benchmark function shown standard simple random, permutation, random gradient, greedy reinforcement learning selection mechanisms show effects learning. idea behind learning mechanisms continue exploit currently selected heuristic long successful. however, probability promising heuristic successful next step relatively low perturbing reasonable solution combinatorial optimisation problem. paper generalise `simple' selection-perturbation mechanisms success measured fixed period time tau, rather single iteration. 
present benchmark function necessary learn exploit particular low-level heuristic, rigorously proving makes difference efficient inefficient algorithm. leadingones prove generalised random gradient, generalised greedy gradient hyper-heuristics achieve optimal performance, generalised greedy, although fast, still outperforms random local search. performance former two hyper-heuristics improves number operators choose increases, generalised greedy hyper-heuristic not. experimental analyses confirm results realistic problem sizes shed light best choices parameter tau various situations.",4 "deconvolution layer convolutional layer?. note, want focus aspects related two questions people asked us cvpr network presented. firstly, relationship proposed layer deconvolution layer? secondly, convolutions low-resolution (lr) space better choice? key questions tried answer paper, able go much depth clarity would liked space allowance. better answer questions note, first discuss relationships deconvolution layer forms transposed convolution layer, sub-pixel convolutional layer efficient sub-pixel convolutional layer. refer efficient sub-pixel convolutional layer convolutional layer lr space distinguish common sub-pixel convolutional layer. show fixed computational budget complexity, network convolutions exclusively lr space representation power speed network first upsamples input high resolution space.",4 "sharpened error bounds random sampling based $\ell_2$ regression. given data matrix $x \in r^{n\times d}$ response vector $y \in r^{n}$, suppose $n>d$, costs $o(n d^2)$ time $o(n d)$ space solve least squares regression (lsr) problem. $n$ $d$ large, exactly solving lsr problem expensive. $n \gg d$, one feasible approach speeding lsr randomly embed $y$ columns $x$ smaller subspace $r^c$; induced lsr problem number columns much fewer number rows, solved $o(c d^2)$ time $o(c d)$ space. discuss paper two random sampling based methods solving lsr efficiently. 
previous work showed leverage scores based sampling based lsr achieves $1+\epsilon$ accuracy $c \geq o(d \epsilon^{-2} \log d)$. paper sharpen error bound, showing $c = o(d \log d + d \epsilon^{-1})$ enough achieving $1+\epsilon$ accuracy. also show $c \geq o(\mu \epsilon^{-2} \log d)$, uniform sampling based lsr attains $2+\epsilon$ bound positive probability.",4 "asf+ --- eine asf-aehnliche spezifikationssprache. maintaining main aspects algebraic specification language asf presented [bergstra&al.89] extend asf following concepts: exported names asf must stay visible top module hierarchy, asf+ permits sophisticated hiding signature names. erroneous merging distinct structures occurs importing different actualizations parameterized module asf avoided asf+ adequate form parameter binding. new ``namensraum''-concept asf+ permits specifier one hand directly identify origin hidden names decide whether imported module accessed whether important property modified. first case access one single globally provided version; second import copy module. finally asf+ permits semantic conditions parameters specification tasks theorem prover.",4 "propagating uncertainty multi-stage bayesian convolutional neural networks application pulmonary nodule detection. motivated problem computer-aided detection (cad) pulmonary nodules, introduce methods propagate fuse uncertainty information multi-stage bayesian convolutional neural network (cnn) architecture. question seek answer ""can take advantage model uncertainty provided one deep learning model improve performance subsequent deep learning models ultimately overall performance multi-stage bayesian deep learning architecture?"". experiments show propagating uncertainty pipeline enables us improve overall performance terms final prediction accuracy model confidence.",4 "neural networks model venezuelan economy. besides indicator gdp, central bank venezuela generates called monthly economic activity general indicator.
priori knowledge indicator, represents sometimes even anticipates economy's fluctuations, could helpful developing public policies investment decision making. purpose study forecasting igaem non parametric methods, approach proven effective wide variety problems economics finance.",4 "framework automated cell tracking phase contrast microscopic videos based normal velocities. paper introduces novel framework automated tracking cells, particular focus challenging situation phase contrast microscopic videos. framework based topology preserving variational segmentation approach applied normal velocity components obtained optical flow computations, appears yield robust tracking automated extraction cell trajectories. order obtain improved trackings local shape features discuss additional correction step based active contours image laplacian optimize example class transformed renal epithelial (mdck-f) cells. also test framework human melanoma cells murine neutrophil granulocytes seeded different types extracellular matrices. results validated manual tracking results.",16 "framework on-line devanagari handwritten character recognition. main challenge on-line handwritten character recognition indian language large size character set, larger similarity different characters script huge variation writing style. paper propose framework on-line handwritten script recognition taking cues speech signal processing literature. framework based identifying strokes, turn lead recognition handwritten on-line characters rather conventional character identification. though framework described devanagari script, framework general applied language. proposed platform consists pre-processing, feature extraction, recognition post processing like conventional character recognition applied strokes. on-line devanagari character recognition reduces one recognizing one 69 primitives recognition character performed recognizing sequence primitives.
show impact noise removal on-line raw data usually noisy. use fuzzy directional features enhance accuracy stroke recognition also described. recognition results compared commonly used directional features literature using several classifiers.",4 "winning arguments: interaction dynamics persuasion strategies good-faith online discussions. changing someone's opinion arguably one important challenges social interaction. underlying process proves difficult study: hard know someone's opinions formed whether someone's views shift. fortunately, changemyview, active community reddit, provides platform users present opinions reasoning, invite others contest them, acknowledge ensuing discussions change original views. work, study interactions understand mechanisms behind persuasion. find persuasive arguments characterized interesting patterns interaction dynamics, participant entry-order degree back-and-forth exchange. furthermore, comparing similar counterarguments opinion, show language factors play essential role. particular, interplay language opinion holder counterargument provides highly predictive cues persuasiveness. finally, since even favorable setting people may persuaded, investigate problem determining whether someone's opinion susceptible changed all. difficult task, show stylistic choices opinion expressed carry predictive power.",4 "convolutional neural network architectures matching natural language sentences. semantic matching central importance many natural language tasks \cite{bordes2014semantic,retrievalqa}. successful matching algorithm needs adequately model internal structures language objects interaction them. step toward goal, propose convolutional neural network models matching two sentences, adapting convolutional strategy vision speech. proposed models nicely represent hierarchical structures sentences layer-by-layer composition pooling, also capture rich matching patterns different levels.
models rather generic, requiring prior knowledge language, hence applied matching tasks different nature different languages. empirical study variety matching tasks demonstrates efficacy proposed model variety matching tasks superiority competitor models.",4 "introduction convolutional neural networks. field machine learning taken dramatic twist recent times, rise artificial neural network (ann). biologically inspired computational models able far exceed performance previous forms artificial intelligence common machine learning tasks. one impressive forms ann architecture convolutional neural network (cnn). cnns primarily used solve difficult image-driven pattern recognition tasks precise yet simple architecture, offers simplified method getting started anns. document provides brief introduction cnns, discussing recently published papers newly formed techniques developing brilliantly fantastic image recognition models. introduction assumes familiar fundamentals anns machine learning.",4 "segmentation classification cine-mr images using fully convolutional networks handcrafted features. three-dimensional cine-mri crucial importance assessing cardiac function. features describe anatomy function cardiac structures (e.g. left ventricle (lv), right ventricle (rv), myocardium(mc)) known significant diagnostic value computed 3d cine-mr images. however, features require precise segmentation cardiac structures. among fully automated segmentation methods, fully convolutional networks (fcn) skip connections shown robustness medical segmentation problems. study, develop complete pipeline classification subjects cardiac conditions based 3d cine-mri. segmentation task, develop 2d fcn introduce parallel paths (pp) way exploit 3d information cine-mr image. classification task, 125 features extracted segmented structures, describing anatomy function. next, two-stage pipeline feature selection using lasso method developed. subset 20 features selected classification. 
subject classified using ensemble logistic regression, multi-layer perceptron, support vector machine classifiers majority voting. dice coefficient segmentation 0.95+-0.03, 0.89+-0.13, 0.90+-0.03 lv, rv, mc respectively. 8-fold cross validation accuracy classification task 95.05% 92.77% based ground truth proposed methods segmentations respectively. results show pps increase segmentation accuracy, exploiting spatial relations. moreover, classification algorithm features showed discriminability keeping sensitivity segmentation error low possible.",4 "sketch-to-design: context-based part assembly. designing 3d objects scratch difficult, especially user intent fuzzy without clear target form. spirit modeling-by-example, facilitate design providing reference inspiration existing model contexts. rethink model design navigating different possible combinations part assemblies based large collection pre-segmented 3d models. propose interactive sketch-to-design system, user sketches prominent features parts combine. sketched strokes analyzed individually context parts generate relevant shape suggestions via design gallery interface. session progresses parts get selected, contextual cues becomes increasingly dominant system quickly converges final design. key enabler, use pre-learned part-based contextual information allow user quickly explore different combinations parts. experiments demonstrate effectiveness approach efficiently designing new variations existing shapes.",4 "parcellation visual cortex high-resolution histological brain sections using convolutional neural networks. microscopic analysis histological sections considered ""gold standard"" verify structural parcellations human brain. high resolution allows study laminar columnar patterns cell distributions, build important basis simulation cortical areas networks. however, cytoarchitectonic mapping semiautomatic, time consuming process scale high throughput imaging. 
present automatic approach parcellating histological sections 2um resolution. based convolutional neural network combines topological information probabilistic atlases texture features learned high-resolution cell-body stained images. model applied visual areas trained sparse set partial annotations. show predictions transferable new brains spatially consistent across sections.",4 "tools terminology processing. automatic terminology processing appeared 10 years ago electronic corpora became widely available. processing may statistically linguistically based produces terminology resources used number applications : indexing, information retrieval, technology watch, etc. present tools developed irin institute. take input texts (or collection texts) reflect different states terminology processing: term acquisition, term recognition term structuring.",4 "new point-set registration algorithm fingerprint matching. novel minutia-based fingerprint matching algorithm proposed employs iterative global alignment two minutia sets. matcher considers possible minutia pairings iteratively aligns two sets number minutia pairs exceed maximum number allowable one-to-one pairings. optimal alignment parameters derived analytically via linear least squares. first alignment establishes region overlap two minutia sets, (iteratively) refined successive alignment. alignment, minutia pairs exhibit weak correspondence discarded. process repeated number remaining pairs longer exceeds maximum number allowable one-to-one pairings. proposed algorithm tested fvc2000 fvc2002 databases, results indicate proposed matcher effective efficient fingerprint authentication; fast utilize computationally expensive mathematical functions (e.g. trigonometric, exponential). 
addition proposed matcher, another contribution paper analytical derivation least squares solution optimal alignment parameters two point-sets lacking exact correspondence.",4 "automatic text extraction character segmentation using maximally stable extremal regions. text detection segmentation important prerequisite many content based image analysis tasks. paper proposes novel text extraction character segmentation algorithm using maximally stable extremal regions basic letter candidates. regions subjected thresholding thereafter various connected components determined identify separate characters. algorithm tested along set various jpeg, png bmp images four different character sets; english, russian, hindi urdu. algorithm gives good results english russian character set; however character segmentation urdu hindi language much accurate. algorithm simple, efficient, involves overhead required training gives good results even low quality images. paper also proposes various challenges text extraction segmentation multilingual inputs.",4 "survey optical character recognition system. optical character recognition (ocr) topic interest many years. defined process digitizing document image constituent characters. despite decades intense research, developing ocr capabilities comparable human still remains open challenge. due challenging nature, researchers industry academic circles directed attentions towards optical character recognition. last years, number academic laboratories companies involved research character recognition increased dramatically. research aims summarizing research far done field ocr. provides overview different aspects ocr discusses corresponding proposals aimed resolving issues ocr.",4 "near-optimal algorithms online matrix prediction. several online prediction problems recent interest comparison class composed matrices bounded entries. 
example, online max-cut problem, comparison class matrices represent cuts given graph online gambling comparison class matrices represent permutations n teams. another important example online collaborative filtering widely used comparison class set matrices small trace norm. paper isolate property matrices, call (beta,tau)-decomposability, derive efficient online learning algorithm, enjoys regret bound o*(sqrt(beta tau t)) problems comparison class composed (beta,tau)-decomposable matrices. analyzing decomposability cut matrices, triangular matrices, low trace-norm matrices, derive near optimal regret bounds online max-cut, online gambling, online collaborative filtering. particular, resolves (in affirmative) open problem posed abernethy (2010); kleinberg et al (2010). finally, derive lower bounds three problems show upper bounds optimal logarithmic factors. particular, lower bound online collaborative filtering problem resolves another open problem posed shamir srebro (2011).",4 "dr.vae: drug response variational autoencoder. present two deep generative models based variational autoencoders improve accuracy drug response prediction. models, perturbation variational autoencoder semi-supervised extension, drug response variational autoencoder (dr.vae), learn latent representation underlying gene states drug application depend on: (i) drug-induced biological change gene (ii) overall treatment response outcome. vae-based models outperform current published benchmarks field anywhere 3 11% auroc 2 30% aupr. addition, found better reconstruction accuracy necessarily lead improvement classification accuracy jointly trained models perform better models minimize reconstruction error independently.",19 "online convex optimization using predictions. making use predictions crucial, under-explored, area online algorithms. paper studies class online optimization problems external noisy predictions available. 
propose stochastic prediction error model generalizes prior models learning stochastic control communities, incorporates correlation among prediction errors, captures fact predictions improve time passes. prove achieving sublinear regret constant competitive ratio online algorithms requires use unbounded prediction window adversarial settings, realistic stochastic prediction error models possible use averaging fixed horizon control (afhc) simultaneously achieve sublinear regret constant competitive ratio expectation using constant-sized prediction window. furthermore, show performance afhc tightly concentrated around mean.",4 "cfo: conditional focused neural question answering large-scale knowledge bases. enable computers automatically answer questions like ""who created character harry potter""? carefully built knowledge bases provide rich sources facts. however, remains challenge answer factoid questions raised natural language due numerous expressions one question. particular, focus common questions --- ones answered single fact knowledge base. propose cfo, conditional focused neural-network-based approach answering factoid questions knowledge bases. approach first zooms question find probable candidate subject mentions, infers final answers unified conditional probabilistic framework. powered deep recurrent neural networks neural embeddings, proposed cfo achieves accuracy 75.7% dataset 108k questions - largest public one date. outperforms current state art absolute margin 11.8%.",4 "weak convergence properties constrained emphatic temporal-difference learning constant slowly diminishing stepsize. consider emphatic temporal-difference (td) algorithm, etd($\lambda$), learning value functions stationary policies discounted, finite state action markov decision process. 
etd($\lambda$) algorithm recently proposed sutton, mahmood, white solve long-standing divergence problem standard td algorithm applied off-policy training, data exploratory policy used evaluate policies interest. almost sure convergence etd($\lambda$) proved recent work general off-policy training conditions, narrow range diminishing stepsize. paper present convergence results constrained versions etd($\lambda$) constant stepsize diminishing stepsize broad range. results characterize asymptotic behavior trajectory iterates produced algorithms, derived combining key properties etd($\lambda$) powerful convergence theorems weak convergence methods stochastic approximation theory. case constant stepsize, addition analyzing behavior algorithms limit stepsize parameter approaches zero, also analyze behavior fixed stepsize bound deviations averaged iterates desired solution. results obtained exploiting weak feller property markov chains associated algorithms, using ergodic theorems weak feller markov chains, conjunction convergence results get weak convergence methods. besides etd($\lambda$), analysis also applies off-policy td($\lambda$) algorithm, divergence issue avoided setting $\lambda$ sufficiently large.",4 "training large scale classifier quantum adiabatic algorithm. previous publication proposed discrete global optimization method train strong binary classifier constructed thresholded sum weak classifiers. motivation cast training classifier format amenable solution quantum adiabatic algorithm. applying adiabatic quantum computing (aqc) promises yield solutions superior achieved classical heuristic solvers. interestingly found using heuristic solvers obtain approximate solutions could already gain advantage standard method adaboost. communication generalize baseline method large scale classifier training. 
large scale mean either cardinality dictionary candidate weak classifiers number weak learners used strong classifier exceed number variables handled effectively single global optimization. situations propose iterative piecewise approach subset weak classifiers selected iteration via global optimization. strong classifier constructed concatenating subsets weak classifiers. show numerical studies generalized method successfully competes adaboost. also provide theoretical arguments proposed optimization method, minimize empirical loss also adds l0-norm regularization, superior versions boosting minimize empirical loss. conducting quantum monte carlo simulation gather evidence quantum adiabatic algorithm able handle generic training problem efficiently.",18 "electrocardiography separation mother baby. extraction electrocardiography (ecg ekg) signals mother baby challenging task, one single device used receives mixture multiple heart beats. paper, would like design filter separate signals other.",4 "simple proximal stochastic gradient method nonsmooth nonconvex optimization. analyze stochastic gradient algorithms optimizing nonconvex, nonsmooth finite-sum problems. particular, objective function given summation differentiable (possibly nonconvex) component, together possibly non-differentiable convex component. propose proximal stochastic gradient algorithm based variance reduction, called proxsvrg+. algorithm slight variant proxsvrg algorithm [reddi et al., 2016b]. main contribution lies analysis proxsvrg+. recovers several existing convergence results (in terms number stochastic gradient oracle calls proximal operations), improves/generalizes others. particular, proxsvrg+ generalizes best results given scsg algorithm, recently proposed [lei et al., 2017] smooth nonconvex case. proxsvrg+ straightforward scsg yields simpler analysis. 
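Training a thresholded sum of weak classifiers with binary weights and an l0 penalty is a discrete global optimization. A toy sketch, with exhaustive enumeration standing in for the quantum adiabatic or heuristic solver (the synthetic data and penalty value are assumptions for illustration):

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

# Synthetic binary task: labels y in {-1,+1}, K weak classifiers h_k in {-1,+1}.
n, K = 60, 8
y = rng.choice([-1, 1], size=n)
# Each weak classifier agrees with y slightly more often than chance.
H = np.where(rng.random((K, n)) < 0.65, y, -y)   # H[k, i] = h_k(x_i)

lam = 0.05   # l0 penalty per selected weak classifier

def objective(w):
    """Empirical 0-1 loss of sign(sum_k w_k h_k) plus an l0 penalty."""
    votes = w @ H
    pred = np.where(votes >= 0, 1, -1)
    return np.mean(pred != y) + lam * w.sum()

# Global optimization by exhaustive enumeration of all 2^K binary weight
# vectors (the role played by the adiabatic/heuristic solver at larger scale).
best_w, best_obj = None, np.inf
for bits in itertools.product([0, 1], repeat=K):
    w = np.array(bits)
    obj = objective(w)
    if obj < best_obj:
        best_w, best_obj = w, obj

print(best_w, best_obj)
```

The iterative piecewise scheme in the abstract applies this inner optimization to one selected subset of weak classifiers per iteration, then concatenates the subsets.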
moreover, proxsvrg+ outperforms deterministic proximal gradient descent (proxgd) wide range minibatch sizes, partially solves open problem proposed [reddi et al., 2016b]. finally, nonconvex functions satisfied polyak-{\l}ojasiewicz condition, show proxsvrg+ achieves global linear convergence rate without restart. proxsvrg+ always worse proxgd proxsvrg/saga, sometimes outperforms (and generalizes results scsg) case.",12 "random gradient extrapolation distributed stochastic optimization. paper, consider class finite-sum convex optimization problems defined distributed multiagent network $m$ agents connected central server. particular, objective function consists average $m$ ($\ge 1$) smooth components associated network agent together strongly convex term. major contribution develop new randomized incremental gradient algorithm, namely random gradient extrapolation method (rgem), require exact gradient evaluation even initial point, achieve optimal ${\cal o}(\log(1/\epsilon))$ complexity bound terms total number gradient evaluations component functions solve finite-sum problems. furthermore, demonstrate stochastic finite-sum optimization problems, rgem maintains optimal ${\cal o}(1/\epsilon)$ complexity (up certain logarithmic factor) terms number stochastic gradient computations, attains ${\cal o}(\log(1/\epsilon))$ complexity terms communication rounds (each round involves one agent). worth noting former bound independent number agents $m$, latter one linearly depends $m$ even $\sqrt m$ ill-conditioned problems. best knowledge, first time complexity bounds obtained distributed stochastic optimization problems. moreover, algorithms developed based novel dual perspective nesterov's accelerated gradient method.",12 "pooled motion features first-person videos. paper, present new feature representation first-person videos. 
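The two ingredients of proxSVRG-style methods, a variance-reduced stochastic gradient for the smooth part and a proximal step for the nonsmooth convex part, can be sketched on a toy lasso problem. Problem sizes, step size, and epoch counts below are arbitrary; this is a generic ProxSVRG sketch, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(1)

# Least squares + l1: F(w) = (1/2n) ||A w - b||^2 + lam ||w||_1
n, d = 200, 20
A = rng.normal(size=(n, d))
w_true = np.zeros(d); w_true[:3] = [2.0, -1.0, 0.5]
b = A @ w_true + 0.1 * rng.normal(size=n)
lam = 0.05

def F(w):
    return 0.5 * np.mean((A @ w - b) ** 2) + lam * np.abs(w).sum()

def prox_l1(v, t):
    """Proximal operator of t*||.||_1 (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def grad_i(w, i):
    """Gradient of the i-th smooth component 0.5*(a_i.w - b_i)^2."""
    return (A[i] @ w - b[i]) * A[i]

w = np.zeros(d)
eta = 0.01
for epoch in range(30):
    w_snap = w.copy()
    full_grad = A.T @ (A @ w_snap - b) / n        # gradient at the snapshot
    for _ in range(n):
        i = rng.integers(n)
        g = grad_i(w, i) - grad_i(w_snap, i) + full_grad  # variance-reduced
        w = prox_l1(w - eta * g, eta * lam)

print(F(np.zeros(d)), F(w))
```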
first-person video understanding (e.g., activity recognition), important capture entire scene dynamics (i.e., egomotion) salient local motion observed videos. describe representation framework based time series pooling, designed abstract short-term/long-term changes feature descriptor elements. idea keep track descriptor values changing time summarize represent motion activity video. framework general, handling types per-frame feature descriptors including conventional motion descriptors like histogram optical flows (hof) well appearance descriptors recent convolutional neural networks (cnn). experimentally confirm approach clearly outperforms previous feature representations including bag-of-visual-words improved fisher vector (ifv) using identical underlying feature descriptors. also confirm feature representation superior performance existing state-of-the-art features like local spatio-temporal features improved trajectory features (originally developed 3rd-person videos) handling first-person videos. multiple first-person activity datasets tested various settings confirm findings.",4 "correspondence insertion as-projective-as-possible image stitching. spatially varying warps increasingly popular image alignment. particular, as-projective-as-possible (apap) warps proven effective accurate panoramic stitching, especially cases significant depth parallax defeat standard homographic warps. however, estimating spatially varying warps requires sufficient number feature matches. image regions feature detection matching fail, warp loses guidance unable accurately model true underlying warp, thus resulting poor registration. paper, propose correspondence insertion method apap warps, focus panoramic stitching. method automatically identifies misaligned regions, inserts appropriate point correspondences increase flexibility warp improve alignment. 
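The time-series pooling idea above, tracking how descriptor elements change over time and summarizing those changes, can be illustrated with a minimal pooled feature. The particular pooling operators chosen here (max, sum, counts of positive and negative temporal gradients) are assumptions for illustration, not the paper's exact operator set:

```python
import numpy as np

def pool_time_series(X):
    """Pool a (T, D) array of per-frame descriptors into one fixed-length
    video feature: per-dimension max, sum, and counts of positive/negative
    temporal gradients."""
    X = np.asarray(X, dtype=float)
    dX = np.diff(X, axis=0)
    return np.concatenate([
        X.max(axis=0),
        X.sum(axis=0),
        (dX > 0).sum(axis=0),   # how often each element increased
        (dX < 0).sum(axis=0),   # how often each element decreased
    ])

# Three frames of a 2-d per-frame descriptor (e.g. HOF or CNN activations).
X = np.array([[0.0, 1.0],
              [2.0, 1.0],
              [1.0, 3.0]])
feat = pool_time_series(X)
print(feat)
```

The framework is agnostic to the per-frame descriptor: the same pooling applies whether rows are optical-flow histograms or CNN features.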
unlike warp varieties, underlying projective regularization apap warps reduces overfitting geometric distortion, despite increases warp complexity. comparisons recent techniques parallax-tolerant image stitching demonstrate effectiveness simplicity approach.",4 "automatically generating commit messages diffs using neural machine translation. commit messages valuable resource comprehension software evolution, since provide record changes feature additions bug repairs. unfortunately, programmers often neglect write good commit messages. different techniques proposed help programmers automatically writing messages. techniques effective describing changed, often verbose lack context understanding rationale behind change. contrast, humans write messages short summarize high level rationale. paper, adapt neural machine translation (nmt) automatically ""translate"" diffs commit messages. trained nmt algorithm using corpus diffs human-written commit messages top 1k github projects. designed filter help ensure trained algorithm higher-quality commit messages. evaluation uncovered pattern messages generate tend either high low quality. therefore, created quality-assurance filter detect cases unable produce good messages, return warning instead.",4 "detection of falls and selected actions in digital image sequences. recent years growing interest action recognition observed, including detection fall accident elderly. however, despite many efforts undertaken, existing technology widely used elderly, mainly flaws like low precision, large number false alarms, inadequate privacy preserving data acquisition processing. research work meets expectations. work empirical situated field computer vision systems. main part work situates area action behavior recognition. efficient algorithms fall detection developed, tested implemented using image sequences wireless inertial sensor worn monitored person. set descriptors depth maps elaborated permit classification pose well action person. 
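Spatially varying warps such as APAP are built from per-location weighted homography fits. A minimal weighted direct linear transform (DLT) sketch follows; the Gaussian weighting is an illustrative stand-in for moving-DLT, not the APAP code:

```python
import numpy as np

def weighted_dlt_homography(src, dst, weights):
    """Estimate a 3x3 homography from point pairs via the direct linear
    transform, with one weight per correspondence (the spatially varying
    weighting used by moving-DLT style warps)."""
    rows = []
    for (x, y), (u, v), w in zip(src, dst, weights):
        rows.append(w * np.array([-x, -y, -1, 0, 0, 0, u * x, u * y, u]))
        rows.append(w * np.array([0, 0, 0, -x, -y, -1, v * x, v * y, v]))
    A = np.array(rows)
    _, _, Vt = np.linalg.svd(A)          # null vector = flattened homography
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]

# Synthetic check: map points with a known homography and recover it.
H_true = np.array([[1.2, 0.1, 5.0],
                   [-0.05, 0.9, -3.0],
                   [1e-4, 2e-4, 1.0]])
src = np.array([[0, 0], [10, 0], [10, 10], [0, 10], [3, 7], [8, 2]], float)
p = np.hstack([src, np.ones((len(src), 1))]) @ H_true.T
dst = p[:, :2] / p[:, 2:3]
# Gaussian weights centered on the point cloud's mean (illustrative).
weights = np.exp(-np.sum((src - src.mean(0)) ** 2, axis=1) / 100.0)
H_est = weighted_dlt_homography(src, dst, weights)
print(np.abs(H_est - H_true).max())
```

With noise-free correspondences any positive weighting recovers the homography exactly; the weights matter once correspondences disagree, which is where the correspondence-insertion step above supplies extra guidance.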
experimental research carried based prepared data repository consisting synchronized depth accelerometric data. study carried scenario static camera facing scene active camera observing scene above. experimental results showed developed algorithms fall detection high sensitivity specificity. algorithm designed regard low computational demands possibility run arm platforms. several experiments including person detection, tracking fall detection real-time carried show efficiency reliability proposed solutions.",4 "learning evaluating musical features deep autoencoders. work describe evaluate methods learn musical embeddings. embedding vector represents four contiguous beats music derived symbolic representation. consider autoencoding-based methods including denoising autoencoders, context reconstruction, evaluate resulting embeddings forward prediction classification task.",4 "direction-aware spatial context features shadow detection. shadow detection fundamental challenging task, since requires understanding global image semantics various backgrounds around shadows. paper presents novel network shadow detection analyzing image context direction-aware manner. achieve this, first formulate direction-aware attention mechanism spatial recurrent neural network (rnn) introducing attention weights aggregating spatial context features rnn. learning weights training, recover direction-aware spatial context (dsc) detecting shadows. design developed dsc module embedded cnn learn dsc features different levels. moreover, weighted cross entropy loss designed make training effective. employ two common shadow detection benchmark datasets perform various experiments evaluate network. experimental results show network outperforms state-of-the-art methods achieves 97% accuracy 38% reduction balance error rate.",4 "modeling state software debugging vhdl-rtl designs -- model-based diagnosis approach. 
paper outline approach applying model-based diagnosis field automatic software debugging hardware designs. present value-level model debugging vhdl-rtl designs show localize erroneous component responsible observed misbehavior. furthermore, discuss extension model supports debugging sequential circuits, given point time, also allows considering temporal behavior vhdl-rtl designs. introduced model capable handling state inherently present every sequential circuit. principal applicability new model outlined briefly use industrial-sized real world examples iscas'85 benchmark suite discuss scalability approach.",4 "supervised texture segmentation: comparative study. paper aims compare four different types feature extraction approaches terms texture segmentation. feature extraction methods used segmentation gabor filters (gf), gaussian markov random fields (gmrf), run-length matrix (rlm) co-occurrence matrix (glcm). shown gf performed best terms quality segmentation glcm localises texture boundaries better compared methods.",4 "facial landmarks detection self-iterative regression based landmarks-attention network. cascaded regression (cr) based methods proposed solve facial landmarks detection problem, learn series descent directions multiple cascaded regressors separately trained coarse fine stages. outperform traditional gradient descent based methods accuracy running speed. however, cascaded regression robust enough regressor's training data comes output previous regressor. moreover, training multiple regressors requires lots computing resources, especially deep learning based methods. paper, develop self-iterative regression (sir) framework improve model efficiency. one self-iterative regressor trained learn descent directions samples coarse stages fine stages, parameters iteratively updated regressor. specifically, proposed landmarks-attention network (lan) regressor, concurrently learns features around landmark obtains holistic location increment. 
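A gray-level co-occurrence matrix of the kind compared above, with two standard texture features derived from it, can be computed directly. The offset and the chosen features (contrast, energy) are illustrative:

```python
import numpy as np

def glcm_features(img, levels, offset=(0, 1)):
    """Gray-level co-occurrence matrix for a given pixel offset, plus the
    contrast and energy texture features derived from it."""
    dr, dc = offset
    P = np.zeros((levels, levels))
    rows, cols = img.shape
    for r in range(rows - dr):
        for c in range(cols - dc):
            P[img[r, c], img[r + dr, c + dc]] += 1
    P /= P.sum()                          # normalize to a joint distribution
    i, j = np.indices(P.shape)
    contrast = np.sum(P * (i - j) ** 2)
    energy = np.sum(P ** 2)
    return P, contrast, energy

# Tiny 2-level image; pairs are counted for the horizontal offset (0, 1).
img = np.array([[0, 0, 1],
                [0, 1, 1]])
P, contrast, energy = glcm_features(img, levels=2)
print(P, contrast, energy)
```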
so, rest regressors removed simplify training process, number model parameters significantly decreased. experiments demonstrate 3.72m model parameters, proposed method achieves state-of-the-art performance.",4 "semi-amortized variational autoencoders. amortized variational inference (avi) replaces instance-specific local inference global inference network. avi enabled efficient training deep generative models variational autoencoders (vae), recent empirical work suggests inference networks produce suboptimal variational parameters. propose hybrid approach, use avi initialize variational parameters run stochastic variational inference (svi) refine them. crucially, local svi procedure differentiable, inference network generative model trained end-to-end gradient-based optimization. semi-amortized approach enables use rich generative models without experiencing posterior-collapse phenomenon common training vaes problems like text generation. experiments show approach outperforms strong autoregressive variational baselines standard text image datasets.",19 "3d face reconstruction learning synthetic data. fast robust three-dimensional reconstruction facial geometric structure single image challenging task numerous applications. here, introduce learning-based approach reconstructing three-dimensional face single image. recent face recovery methods rely accurate localization key characteristic points. contrast, proposed approach based convolutional-neural-network (cnn) extracts face geometry directly image. although deep architectures outperform models complex computer vision problems, training properly requires large dataset annotated examples. case three-dimensional faces, currently, large volume data sets, acquiring big-data tedious task. alternative, propose generate random, yet nearly photo-realistic, facial images geometric form known. 
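The semi-amortized recipe above, initialize variational parameters with an inference network and then refine them with a few SVI gradient steps, can be sketched on a one-dimensional Gaussian toy model where the ELBO has a closed form. All quantities here (the model, the imperfect amortized initializer, step counts) are illustrative assumptions:

```python
import numpy as np

# Toy model: p(z) = N(0,1), p(x|z) = N(z,1), variational q(z) = N(mu, 1).
# With the variance fixed, up to constants:
#   ELBO(mu; x) = -0.5*(x - mu)**2 - 0.5*mu**2,  maximized at mu* = x/2.
def elbo(mu, x):
    return -0.5 * (x - mu) ** 2 - 0.5 * mu ** 2

def elbo_grad(mu, x):
    return (x - mu) - mu

x = 3.0
mu_amortized = 0.4 * x        # stand-in for an (imperfect) inference network
mu = mu_amortized
for _ in range(20):           # local SVI refinement steps
    mu += 0.1 * elbo_grad(mu, x)

mu_opt = x / 2
print(mu_amortized, mu, elbo(mu_amortized, x), elbo(mu, x))
```

In the full method these refinement steps are differentiated through, so the inference network and generative model still train end-to-end.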
suggested model successfully recovers facial shapes real images, even faces extreme expressions various lighting conditions.",4 "geometric enclosing networks. training model generate data increasingly attracted research attention become important modern world applications. propose paper new geometry-based optimization approach address problem. orthogonal current state-of-the-art density-based approaches, notably vae gan, present fresh new idea borrows principle minimal enclosing ball train generator g\left(\bz\right) way training generated data, mapped feature space, enclosed sphere. develop theory guarantee mapping bijective inverse feature space data space results expressive nonlinear contours describe data manifold, hence ensuring data generated also lying data manifold learned training data. model enjoys nice geometric interpretation, hence termed geometric enclosing networks (gen), possesses key advantages rivals, namely simple easy-to-control optimization formulation, avoidance mode collapsing efficiently learn data manifold representation completely unsupervised manner. conducted extensive experiments synthesis real-world datasets illustrate behaviors, strength weakness proposed gen, particular ability handle multi-modal data quality generated data.",4 "automatic detection trends dynamical text: evolutionary approach. paper presents evolutionary algorithm modeling arrival dates document streams, time-stamped collection documents, newscasts, e-mails, irc conversations, scientific journals archives weblog postings. algorithm assigns frequencies (number document arrivals per time unit) time intervals produces optimal fit data. optimization trade accurately fitting data avoiding many frequency changes; way analysis able find fits ignore noise. classical dynamic programming algorithms limited memory efficiency requirements, problem dealing long streams. suggests explore alternative search methods allow degree uncertainty achieve tractability. 
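The minimal enclosing ball principle that GEN borrows can be illustrated with the simple Badoiu-Clarkson iteration; this is an assumption for illustration (the paper trains a generator so that mapped data is enclosed by a sphere in feature space, rather than computing balls directly):

```python
import numpy as np

def enclosing_ball(points, iters=500):
    """(1+eps)-approximate minimal enclosing ball via the Badoiu-Clarkson
    iteration: repeatedly step the center toward the farthest point with a
    decaying step size."""
    pts = np.asarray(points, float)
    c = pts[0].copy()
    for t in range(1, iters + 1):
        far = pts[np.argmax(np.linalg.norm(pts - c, axis=1))]
        c += (far - c) / (t + 1)
    r = np.linalg.norm(pts - c, axis=1).max()
    return c, r

# Points on the unit circle: the optimal ball has center 0 and radius 1.
theta = np.linspace(0, 2 * np.pi, 64, endpoint=False)
pts = np.stack([np.cos(theta), np.sin(theta)], axis=1)
c, r = enclosing_ball(pts)
print(c, r)
```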
experiments shown designed evolutionary algorithm able reach solution quality classical dynamic programming algorithms shorter time. also explored different probabilistic models optimize fitting date streams, applied algorithms infer whether new arrival increases decreases {\em interest} topic document stream about.",4 "towards reducing multidimensionality olap cubes using evolutionary algorithms factor analysis methods. data warehouses structures large amount data collected heterogeneous sources used decision support system. data warehouses analysis identifies hidden patterns initially unexpected analysis requires great memory computation cost. data reduction methods proposed make analysis easier. paper, present hybrid approach based genetic algorithms (ga) evolutionary algorithms multiple correspondence analysis (mca) analysis factor methods conduct reduction. approach identifies reduced subset dimensions initial subset p p'
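The fitting problem described above, piecewise-constant arrival frequencies with a penalty for each frequency change, has the classical dynamic programming baseline the evolutionary algorithm is compared against. A minimal sketch, with squared-error segment cost and penalty value chosen for illustration:

```python
import numpy as np

def segment_counts(x, beta):
    """Optimal piecewise-constant fit of a count sequence by dynamic
    programming: minimize the sum of within-segment squared errors plus a
    penalty beta per segment, so noise is ignored unless a change pays for
    itself."""
    x = np.asarray(x, float)
    n = len(x)

    def sse(i, j):                      # cost of fitting one segment x[i:j]
        seg = x[i:j]
        return float(np.sum((seg - seg.mean()) ** 2))

    F = [0.0] + [np.inf] * n            # F[j] = best cost of x[:j]
    back = [0] * (n + 1)
    for j in range(1, n + 1):
        for i in range(j):
            cost = F[i] + sse(i, j) + beta
            if cost < F[j]:
                F[j], back[j] = cost, i
    bounds, j = [], n                   # recover boundaries via back-pointers
    while j > 0:
        bounds.append(j)
        j = back[j]
    return sorted(bounds)

x = [5, 5, 5, 5, 20, 20, 20, 20]        # document arrivals per time unit
print(segment_counts(x, beta=1.0))
```

This exact solver is O(n^2) in the stream length, which is the memory/efficiency limitation that motivates the evolutionary alternative.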
$\sigma>\sigma_0$ $\sigma<\sigma_0$ different behaviors: former scales $n^{2/3}$ latter scales $n^{3/4}$.",12 "investigating parameter space evolutionary algorithms. practice evolutionary algorithms involves tuning many parameters. big population be? many generations algorithm run? (tournament selection) tournament size? probabilities one assign crossover mutation? extensive series experiments multiple evolutionary algorithm implementations problems show parameter space tends rife viable parameters, least 25 problems studied herein. discuss implications finding practice.",4 "selection giant radio sources nvss. results application pattern recognition techniques problem identifying giant radio sources (grs) data nvss catalog presented issues affecting process explored. decision-tree pattern recognition software applied training set source pairs developed known nvss large angular size radio galaxies. full training set consisted 51,195 source pairs, 48 known grs lobe primarily represented single catalog component. source pairs maximum separation 20 arc minutes minimum component area 1.87 square arc minutes 1.4 mjy level. importance comparing resulting probability distributions training application sets cases unknown class ratio demonstrated. probability correctly ranking randomly selected (grs, non-grs) pair best tested classifiers determined 97.8 +/- 1.5%. best classifiers applied 870,000 candidate pairs entire catalog. images higher ranked sources visually screened table sixteen hundred candidates, including morphological annotation, presented. systems include doubles triples, wide-angle tail (wat) narrow-angle tail (nat), s- z-shaped systems, core-jets resolved cores. resolved lobe systems recovered technique, generally expected systems would require different approach.",1 "candis: coupled & attention-driven neural distant supervision. distant supervision relation extraction uses heuristically aligned text data existing knowledge base training data. 
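The quoted 97.8% probability of correctly ranking a randomly selected (GRS, non-GRS) pair is the familiar AUC statistic, computable directly from classifier scores:

```python
import numpy as np

def pairwise_auc(pos_scores, neg_scores):
    """Probability that a randomly chosen positive outranks a randomly
    chosen negative, with ties counting one half (the pair-ranking
    statistic, a.k.a. the AUC)."""
    pos = np.asarray(pos_scores, float)
    neg = np.asarray(neg_scores, float)
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))

# Four pairs: three wins and one tie -> 3.5 / 4.
auc = pairwise_auc([0.9, 0.8], [0.7, 0.8])
print(auc)
```

Unlike accuracy, this statistic does not depend on the (unknown) class ratio of the application set, which is why it suits the screening setting described above.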
unsupervised nature technique allows scale web-scale relation extraction tasks, expense noise training data. previous work explored relationships among instances entity-pair reduce noise, relationships among instances across entity-pairs fully exploited. explore use inter-instance couplings based verb-phrase entity type similarities. propose novel technique, candis, casts distant supervision using inter-instance coupling end-to-end neural network model. candis incorporates attention module instance-level model multi-instance nature problem. candis outperforms existing state-of-the-art techniques standard benchmark dataset.",4 "advances self organising maps. self-organizing map (som) related extensions popular artificial neural algorithm use unsupervised learning, clustering, classification data visualization. 5,000 publications reported open literature, many commercial projects employ som tool solving hard real-world problems. two years, ""workshop self-organizing maps"" (wsom) covers new developments field. wsom series conferences initiated 1997 prof. teuvo kohonen, successfully organized 1997 1999 helsinki university technology, 2001 university lincolnshire humberside, 2003 kyushu institute technology. universit\'{e} paris panth\'{e}on sorbonne (samos-matisse research centre) organized wsom 2005 paris september 5-8, 2005.",4 "spike slab gaussian process latent variable models. gaussian process latent variable model (gp-lvm) popular approach non-linear probabilistic dimensionality reduction. one design choice model number latent variables. present spike slab prior gp-lvm propose efficient variational inference procedure gives lower bound log marginal likelihood. new model provides principled approach selecting latent dimensions standard way thresholding length-scale parameters. effectiveness approach demonstrated experiments real simulated data. 
further, extend multi-view gaussian processes rely sharing latent dimensions (known manifold relevance determination) spike slab priors. allows principled approach selecting subset latent space view data. extended model outperforms previous state-of-the-art applied cross-modal multimedia retrieval task.",19 "loop descriptor: local optimal oriented pattern. letter introduces loop binary descriptor (local optimal oriented pattern) encodes rotation invariance main formulation itself. makes post processing stage rotation invariance redundant improves accuracy time complexity. consider fine-grained lepidoptera (moth/butterfly) species recognition representative problem since involves repetition localized patterns textures may exploited discrimination. evaluate performance loop predecessors well popular descriptors. besides experiments standard benchmarks, also introduce new small image dataset nz lepidoptera. loop performs well better datasets evaluated compared previous binary descriptors. new dataset demo code proposed method made available lead author's academic webpage github.",4 "learning discriminative model perception realism composite images. makes image appear realistic? work, answering question data-driven perspective learning perception visual realism directly large amounts data. particular, train convolutional neural network (cnn) model distinguishes natural photographs automatically generated composite images. model learns predict visual realism scene terms color, lighting texture compatibility, without human annotations pertaining it. model outperforms previous works rely hand-crafted heuristics, task classifying realistic vs. unrealistic photos. furthermore, apply learned model compute optimal parameters compositing method, maximize visual realism score predicted cnn model. demonstrate advantage existing methods via human perception study.",4 "method stopping active learning based stabilizing predictions need user-adjustable stopping. 
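LOOP encodes rotation invariance in the formulation itself via oriented responses; the post-processing stage it renders redundant is the classical trick of mapping each local binary pattern code to its minimum circular bit rotation, sketched here for contrast (the neighborhood values are made up for illustration):

```python
def lbp_code(neighbors, center):
    """Standard 8-neighbor local binary pattern code."""
    bits = [1 if n >= center else 0 for n in neighbors]
    return sum(b << i for i, b in enumerate(bits))

def rotation_invariant(code, bits=8):
    """Map a code to the minimum over all of its circular bit rotations,
    the classical post-hoc rotation-invariance fix."""
    best = code
    for _ in range(bits - 1):
        code = ((code >> 1) | ((code & 1) << (bits - 1))) & ((1 << bits) - 1)
        best = min(best, code)
    return best

# Rotating the neighborhood changes the raw LBP code ...
a = lbp_code([9, 1, 1, 1, 1, 1, 1, 9], center=5)
b = lbp_code([1, 1, 1, 1, 1, 1, 9, 9], center=5)
# ... but both rotations collapse to the same invariant code.
print(a, b, rotation_invariant(a), rotation_invariant(b))
```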
survey existing methods stopping active learning (al) reveals needs methods are: widely applicable; aggressive saving annotations; stable across changing datasets. new method stopping al based stabilizing predictions presented addresses needs. furthermore, stopping methods required handle broad range different annotation/performance tradeoff valuations. despite this, existing body work dominated conservative methods little (if any) attention paid providing users control behavior stopping methods. proposed method shown fill gap level aggressiveness available stopping al supports providing users control stopping behavior.",4 "understanding social cascading geekspeak upshots social cognitive systems. barring swarm robotics, substantial share current machine-human machine-machine learning interaction mechanisms developed fed results agent-based computer simulations, game-theoretic models, robotic experiments based dyadic communication pattern. yet, real life, humans less frequently communicate groups, gain knowledge take decisions basing information cumulatively gleaned one single source. properties taken consideration design autonomous artificial cognitive systems construed interact learn one contact 'neighbour'. end, significant practical import gleaned research applying strict science methodology human social phenomena, e.g. discovery realistic creativity potential spans, 'exposure thresholds' new information could accepted cognitive agent. results presented project analysing social propagation neologisms microblogging service. local, low-level interactions information flows agents inventing imitating discrete lexemes aim describe processes emergence global systemic order dynamics, using latest methods complexity science. 
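A minimal sketch of stopping active learning via stabilizing predictions: compare successive models' predictions on a held-out stop set with Cohen's kappa and stop after a run of high-agreement comparisons. The threshold and window are exactly the user-adjustable knobs discussed above; their values here are illustrative:

```python
def cohens_kappa(a, b):
    """Agreement between two label sequences, corrected for chance."""
    assert len(a) == len(b)
    n = len(a)
    po = sum(x == y for x, y in zip(a, b)) / n
    labels = set(a) | set(b)
    pe = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)
    if pe == 1.0:                # both sequences constant and identical
        return 1.0
    return (po - pe) / (1 - pe)

def stop_iteration(pred_history, threshold=0.99, window=3):
    """First iteration at which agreement between successive models'
    predictions on the stop set stayed above `threshold` for `window`
    consecutive comparisons; None if it never stabilizes."""
    run = 0
    for t in range(1, len(pred_history)):
        if cohens_kappa(pred_history[t - 1], pred_history[t]) >= threshold:
            run += 1
            if run >= window:
                return t
        else:
            run = 0
    return None

history = [
    [0, 1, 0, 1, 1, 0],   # early models disagree ...
    [1, 1, 0, 0, 1, 0],
    [0, 1, 1, 0, 1, 1],   # ... then predictions stabilize
    [0, 1, 1, 0, 1, 1],
    [0, 1, 1, 0, 1, 1],
    [0, 1, 1, 0, 1, 1],
]
print(stop_iteration(history))
```

Lowering the threshold or window makes the method more aggressive in saving annotations; raising them makes it more conservative.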
whether order mimic them, 'enhance' them, parameters gleaned complexity science approaches humans' social humanistic behaviour subsequently incorporated points reference field robotics human-machine interaction.",4 "real-time halfway domain reconstruction motion geometry. present novel approach real-time joint reconstruction 3d scene motion geometry binocular stereo videos. approach based novel variational halfway-domain scene flow formulation, allows us obtain highly accurate spatiotemporal reconstructions shape motion. solve underlying optimization problem real-time frame rates using novel data-parallel robust non-linear optimization strategy. fast convergence large displacement flows achieved employing novel hierarchy stores delta flows hierarchy levels. high performance obtained introduction coarser warp grid decouples number unknowns input resolution images. demonstrate approach live setup based two commodity webcams, well publicly available video data. extensive experiments evaluations show approach produces high-quality dense reconstructions 3d geometry scene flow real-time frame rates, compares favorably state art.",4 "adversarial networks prostate cancer detection. large number trainable parameters deep neural networks renders inherently data hungry. characteristic heavily challenges medical imaging community make things even worse, many imaging modalities ambiguous nature leading rater-dependant annotations current loss formulations fail capture. propose employing adversarial training segmentation networks order alleviate aforementioned problems. learn segment aggressive prostate cancer utilizing challenging mri images 152 patients show proposed scheme superior de facto standard terms detection sensitivity dice-score aggressive prostate cancer. achieved relative gains shown particularly pronounced small dataset limit.",4 "striving simplicity: convolutional net. 
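The two evaluation criteria reported for the prostate segmentations above, Dice score and detection sensitivity, are simple binary-mask statistics:

```python
import numpy as np

def dice_and_sensitivity(pred, truth):
    """Dice overlap 2|A∩B|/(|A|+|B|) and sensitivity TP/(TP+FN)
    for binary segmentation masks."""
    pred = np.asarray(pred, bool)
    truth = np.asarray(truth, bool)
    tp = np.logical_and(pred, truth).sum()
    dice = 2.0 * tp / (pred.sum() + truth.sum())
    sensitivity = tp / truth.sum()
    return dice, sensitivity

truth = np.array([[1, 1, 0, 0],
                  [1, 1, 0, 0]])
pred  = np.array([[1, 1, 1, 0],
                  [1, 0, 0, 0]])
dice, sens = dice_and_sensitivity(pred, truth)
print(dice, sens)
```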
modern convolutional neural networks (cnns) used object recognition built using principles: alternating convolution max-pooling layers followed small number fully connected layers. re-evaluate state art object recognition small images convolutional networks, questioning necessity different components pipeline. find max-pooling simply replaced convolutional layer increased stride without loss accuracy several image recognition benchmarks. following finding -- building recent work finding simple network structures -- propose new architecture consists solely convolutional layers yields competitive state art performance several object recognition datasets (cifar-10, cifar-100, imagenet). analyze network introduce new variant ""deconvolution approach"" visualizing features learned cnns, applied broader range network structures existing approaches.",4 "probabilistic tools analysis randomized optimization heuristics. chapter collects several probabilistic tools proved useful analysis randomized search heuristics. includes classic material like markov, chebyshev chernoff inequalities, also lesser known topics like stochastic domination coupling chernoff bounds geometrically distributed random variables negatively correlated random variables. almost results presented appeared previously, some, however, recent conference publications. focus collecting tools analysis randomized search heuristics, many may useful well analysis classic randomized algorithms discrete random structures.",4 "relation color image denoising classification. large amount image denoising literature focuses single channel images often experimentally validates proposed methods tens images most. paper, investigate interaction denoising classification large scale dataset. inspired classification models, propose novel deep learning architecture color (multichannel) image denoising report thousands images imagenet dataset well commonly used imagery. 
study importance (sufficient) training data, semantic class information traded improved denoising results. result, method greatly improves psnr performance 0.34 - 0.51 db average state-of-the art methods large scale dataset. conclude beneficial incorporate classification models. hand, also study noise affect classification performance. end, come number interesting conclusions, counter-intuitive.",4 "explainable entity-based recommendations knowledge graphs. explainable recommendation important task. many methods proposed generate explanations content reviews written items. review text unavailable, generating explanations still hard problem. paper, illustrate explanations generated scenario leveraging external knowledge form knowledge graphs. method jointly ranks items knowledge graph entities using personalized pagerank procedure produce recommendations together explanations.",4 "sambaten: sampling-based batch incremental tensor decomposition. tensor decompositions invaluable tools analyzing multimodal datasets. many real-world scenarios, datasets far static, contrary tend grow time. instance, online social network setting, observe new interactions time, dataset gets updated ""time"" mode. maintain valid accurate tensor decomposition dynamically evolving multimodal dataset, without re-compute entire decomposition every single update? paper introduce sambaten, sampling-based batch incremental tensor decomposition algorithm, incrementally maintains decomposition given new updates tensor dataset. sambaten able scale datasets state-of-the-art incremental tensor decomposition unable operate on, due ability effectively summarize existing tensor incoming updates, perform computations reduced summary space. extensively evaluate sambaten using synthetic real datasets. indicatively, sambaten achieves comparable accuracy state-of-the-art incremental non-incremental techniques, 25-30 times faster. 
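The joint ranking of items and knowledge-graph entities above rests on personalized PageRank, a random walk with restart to a seed node. A minimal power-iteration sketch; the toy graph and restart probability are illustrative assumptions:

```python
import numpy as np

def personalized_pagerank(adj, seed, restart=0.2, iters=200):
    """Personalized PageRank by power iteration: at each step the walker
    restarts at the seed with probability `restart`, otherwise follows a
    uniformly random outgoing edge. The stationary scores rank all nodes
    relative to the seed."""
    A = np.asarray(adj, float)
    P = A / A.sum(axis=1, keepdims=True)     # row-stochastic transitions
    e = np.zeros(len(A)); e[seed] = 1.0
    p = e.copy()
    for _ in range(iters):
        p = restart * e + (1 - restart) * (P.T @ p)
    return p

# Star graph: node 0 (the seed, e.g. a user) connected to nodes 1-3.
adj = np.array([[0, 1, 1, 1],
                [1, 0, 0, 0],
                [1, 0, 0, 0],
                [1, 0, 0, 0]])
p = personalized_pagerank(adj, seed=0)
print(p)
```

When the graph mixes items and knowledge-graph entities, the same score vector yields both recommendations (top-ranked items) and explanations (top-ranked entities connecting seed and item).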
furthermore, sambaten scales large sparse dense dynamically evolving tensors dimensions 100k x 100k x 100k state-of-the-art incremental approaches able operate.",19 "improving lexical choice neural machine translation. explore two solutions problem mistranslating rare words neural machine translation. first, argue standard output layer, computes inner product vector representing context possible output word embeddings, rewards frequent words disproportionately, propose fix norms vectors constant value. second, integrate simple lexical module jointly trained rest model. evaluate approaches eight language pairs data sizes ranging 100k 8m words, achieve improvements +4.5 bleu, surpassing phrase-based translation nearly settings.",4 "spin glass models syntax language evolution. using sswl database syntactic parameters world languages, mit media lab data language interactions, construct spin glass model language evolution. treat binary syntactic parameters spin states, languages vertices graph, assigned interaction energies along edges. study rough model syntax evolution, assumption strong interaction energy tends cause parameters align, case ferromagnetic materials. also study spin glass model needs modified account entailment relations syntactic parameters. modification leads naturally generalization potts models external magnetic field, consists coupling vertices ising model potts model q=3, edge interactions. describe results simulations dynamics models, different temperature energy regimes. discuss linguistic interpretation parameters physical model.",4 "denoising gravitational waves using deep learning recurrent denoising autoencoders. gravitational wave astronomy rapidly growing field modern astrophysics, observations made frequently ligo detectors. gravitational wave signals often extremely weak data detectors, ligo, contaminated non-gaussian non-stationary noise, often containing transient disturbances obscure real signals. 
traditional denoising methods, principal component analysis dictionary learning, optimal dealing non-gaussian noise, especially low signal-to-noise ratio gravitational wave signals. furthermore, methods computationally expensive large datasets. overcome issues, apply state-of-the-art signal processing techniques, based recent groundbreaking advancements deep learning, denoise gravitational wave signals embedded either gaussian noise real ligo noise. introduce smtdae, staired multi-timestep denoising autoencoder, based sequence-to-sequence bi-directional long-short-term-memory recurrent neural networks. demonstrate advantages using unsupervised deep learning approach show that, training using simulated gaussian noise, smtdae achieves superior recovery performance gravitational wave signals embedded real non-gaussian ligo noise.",7 "machine learning approach opinion holder extraction arabic language. opinion mining aims extracting useful subjective information reliable amounts text. opinion mining holder recognition task considered yet arabic language. task essentially requires deep understanding clauses structures. unfortunately, lack robust, publicly available, arabic parser complicates research. paper presents leading research opinion holder extraction arabic news independent lexical parsers. investigate constructing comprehensive feature set compensate lack parsing structural outcomes. proposed feature set tuned english previous works coupled proposed semantic field named entities features. feature analysis based conditional random fields (crf) semi-supervised pattern recognition techniques. different research models evaluated via cross-validation experiments achieving 54.03 f-measure. publicly release research outcome corpus lexicon opinion mining community encourage research.",4 "dynamic safe interruptibility decentralized multi-agent reinforcement learning. reinforcement learning, agents learn performing actions observing outcomes. 
sometimes, it is desirable for a human operator to \textit{interrupt} an agent in order to prevent dangerous situations from happening. yet, as part of their learning process, agents may link these interruptions, which impact their reward, to specific states and deliberately avoid them. this situation is particularly challenging in a multi-agent context, because agents might not only learn from their own past interruptions, but also from those of other agents. orseau and armstrong defined \emph{safe interruptibility} for one learner, but their work does not naturally extend to multi-agent systems. this paper introduces \textit{dynamic safe interruptibility}, an alternative definition more suited to decentralized learning problems, and studies this notion in two learning frameworks: \textit{joint action learners} and \textit{independent learners}. we give realistic sufficient conditions on the learning algorithm to enable dynamic safe interruptibility in the case of joint action learners, yet show that these conditions are not sufficient for independent learners. we show however that if agents can detect interruptions, it is possible to prune the observations to ensure dynamic safe interruptibility even for independent learners.",4 "sparse partial least squares for on-line variable selection in multivariate data streams. in this paper we propose a computationally efficient algorithm for on-line variable selection in multivariate regression problems involving high dimensional data streams. the algorithm recursively extracts the latent factors of a partial least squares solution and selects the most important variables for each factor. this is achieved by means of only one sparse singular value decomposition which can be efficiently updated on-line and in an adaptive fashion. simulation results based on artificial data streams demonstrate that the algorithm is able to select important variables in dynamic settings where the correlation structure among the observed streams is governed by a few hidden components and the importance of each variable changes over time. we also report an application of our algorithm to a multivariate version of the ""enhanced index tracking"" problem using financial data streams. the application consists of performing on-line asset allocation with the objective of overperforming two benchmark indices simultaneously.",19 "do regularized auto-encoders learn a sparse representation?.
the authors of batch normalization (bn) identify and address an important problem involved in training deep networks-- \textit{internal covariate shift}-- though the current solution has certain drawbacks. for instance, bn depends on batch statistics for layerwise input normalization during training, which makes the estimates of the mean and standard deviation of the input (distribution) to hidden layers inaccurate due to shifting parameter values (especially during the initial training epochs). another fundamental problem is that bn cannot be used with a batch-size of $ 1 $ during training. we address these drawbacks of bn by proposing a non-adaptive normalization technique for removing covariate shift, which we call \textit{normalization propagation}. our approach does not depend on batch statistics, but rather uses a data-independent parametric estimate of the mean and standard-deviation at every layer, and is thus computationally faster compared with bn. we exploit the observation that the pre-activation before rectified linear units follows a gaussian distribution in deep networks, and that once the first and second order statistics of any given dataset are normalized, we can forward propagate this normalization without the need for recalculating the approximate statistics for hidden layers.",19 "a max-sum algorithm for training discrete neural networks. we present an efficient learning algorithm for the problem of training neural networks with discrete synapses, a well-known hard (np-complete) discrete optimization problem. the algorithm is a variant of the so-called max-sum (ms) algorithm. in particular, we show how, for bounded integer weights with $q$ distinct states and independent concave a priori distributions (e.g. $l_{1}$ regularization), the algorithm's time complexity can be made to scale as $o\left(n\log n\right)$ per node update, thus putting it on par with alternative schemes, such as belief propagation (bp), without resorting to approximations. two special cases are of particular interest: binary synapses $w\in\{-1,1\}$ and ternary synapses $w\in\{-1,0,1\}$ with $l_{0}$ regularization. the algorithm we present performs as well as bp on binary perceptron learning problems, and may be better suited to address the problem on fully-connected two-layer networks, since the inherent symmetries in two-layer networks are naturally broken using the ms approach.",3 "bayesian rose trees.
hierarchical structure is ubiquitous in data across many domains. there are many hierarchical clustering methods, frequently used by domain experts, which strive to discover this structure. however, most of these methods limit the discoverable hierarchies to those with binary branching structure. this limitation, while computationally convenient, is often undesirable. in this paper we explore a bayesian hierarchical clustering algorithm that can produce trees with arbitrary branching structure at each node, known as rose trees. we interpret these trees as mixtures over partitions of a data set, and use a computationally efficient, greedy agglomerative algorithm to find the rose trees which have high marginal likelihood given the data. lastly, we perform experiments which demonstrate that rose trees are better models of the data than the typical binary trees returned by other hierarchical clustering algorithms.",4 "a leaf recognition algorithm for plant classification using a probabilistic neural network. in this paper, we employ a probabilistic neural network (pnn) with image and data processing techniques to implement a general purpose automated leaf recognition algorithm. 12 leaf features are extracted and orthogonalized into 5 principal variables which consist of the input vector of the pnn. the pnn is trained by 1800 leaves to classify 32 kinds of plants with an accuracy greater than 90%. compared with other approaches, this algorithm is an accurate artificial intelligence approach which is fast in execution and easy in implementation.",4 "cloudcv: large scale distributed computer vision as a cloud service. we are witnessing a proliferation of massive visual data. unfortunately, scaling existing computer vision algorithms to large datasets leaves researchers repeatedly solving the same algorithmic, logistical, and infrastructural problems. our goal is to democratize computer vision; one should not have to be a computer vision, big data and distributed computing expert to have access to state-of-the-art distributed computer vision algorithms. we present cloudcv, a comprehensive system to provide access to state-of-the-art distributed computer vision algorithms as a cloud service through a web interface and apis.",4 "holistic interstitial lung disease detection using deep convolutional neural networks: multi-label learning and unordered pooling.
accurately predicting and detecting interstitial lung disease (ild) patterns given any computed tomography (ct) slice, without any pre-processing prerequisites such as manually delineated regions of interest (rois), is a clinically desirable yet challenging goal. the majority of existing work relies on manually-provided ild rois to extract sampled 2d image patches from ct slices and, from there, performs patch-based ild categorization. acquiring manual rois is labor intensive and serves as a bottleneck towards fully-automated ct imaging ild screening over large-scale populations. furthermore, despite the considerably high frequency of more than one ild pattern on a single ct slice, previous works were designed to detect only one ild pattern per slice or patch. to tackle these two critical challenges, we present multi-label deep convolutional neural networks (cnns) for detecting ilds from holistic ct slices (instead of rois or sub-images). conventional single-labeled cnn models are augmented to cope with the possible presence of multiple ild pattern labels, via 1) continuous-valued deep regression based on robust norm loss functions and 2) a categorical objective as the sum of element-wise binary logistic losses. our methods are evaluated and validated using a publicly available database of 658 patient ct scans under five-fold cross-validation, achieving promising performance on detecting four major ild patterns: ground glass, reticular, honeycomb, and emphysema. we also investigate the effectiveness of a cnn activation-based deep-feature encoding scheme using fisher vector encoding, which treats ild detection as spatially-unordered deep texture classification.",4 "the top 10 topics in machine learning revisited: a quantitative meta-study. which topics of machine learning are most commonly addressed in research? this question was initially answered in 2007 by a qualitative survey among distinguished researchers. in our study, we revisit this question from a quantitative perspective. concretely, we collect 54k abstracts of papers published between 2007 and 2016 in leading machine learning journals and conferences. we then use machine learning in order to determine the top 10 topics in machine learning. we not only include models, but provide a holistic view across optimization, data, features, etc.
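the categorical objective used for the ild detector above -- a sum of element-wise binary logistic (sigmoid cross-entropy) losses, one per pattern label -- can be sketched as follows; the scores and labels are illustrative values, not from the paper's data.

```python
import numpy as np

# sum of element-wise binary logistic losses for multi-label prediction.
def multilabel_logistic_loss(scores, labels):
    # sigmoid probabilities, one independent binary decision per label
    p = 1.0 / (1.0 + np.exp(-scores))
    eps = 1e-12  # numerical floor to keep the logs finite
    return -np.sum(labels * np.log(p + eps) + (1 - labels) * np.log(1 - p + eps))

scores = np.array([2.0, -1.5, 0.0, 3.0])  # one logit per ild pattern
labels = np.array([1.0, 0.0, 0.0, 1.0])   # patterns present in the slice
loss = multilabel_logistic_loss(scores, labels)
```

because each label contributes its own binary loss, several patterns can be marked present on the same slice, which is exactly what the single-label softmax objective cannot express.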
this quantitative approach allows reducing the bias of qualitative surveys. it reveals new and up-to-date insights into what the 10 most prolific topics in machine learning research are. this allows researchers to identify popular topics as well as new and rising topics for their research.",4 "modelling and analysis of temporal preference drifts using a component-based factorised latent approach. changes in user preferences can originate from substantial reasons, like a personality shift, or transient and circumstantial ones, like seasonal changes in item popularities. disregarding temporal drifts in modelling user preferences can result in unhelpful recommendations. moreover, different temporal patterns can be associated with various preference domains, preference components and their combinations. these components comprise preferences over features, preferences over feature values, conditional dependencies between features, socially-influenced preferences, and bias. for example, in the movies domain, a user can change his rating behaviour (bias shift), his preference for genre over language (feature preference shift), or start favouring drama over comedy (feature value preference shift). in this paper, we first propose a novel latent factor model to capture the domain-dependent component-specific temporal patterns in preferences. the component-based approach followed in modelling the aspects of preferences and their temporal effects enables us to arbitrarily switch components on and off. we evaluate the proposed method on three popular recommendation datasets and show that it significantly outperforms the most accurate state-of-the-art static models. the experiments also demonstrate the greater robustness and stability of the proposed dynamic model in comparison with the most successful models to date. we also analyse the temporal behaviour of different preference components and their combinations and show that the dynamic behaviour of preference components is highly dependent on the preference dataset and domain. therefore, our results not only highlight the importance of modelling temporal effects but also underline the advantages of a component-based architecture that is better suited to capture domain-specific balances in the contributions of the aspects.",4 "data and city indicators: a knowledge graph supporting the automatic generation of dashboards.
in the context of smart cities, indicator definitions are used to calculate values that enable comparison among different cities. the calculation of indicator values presents challenges because the calculation may need to combine aspects of quality while addressing different levels of abstraction. knowledge graphs (kgs) have been used successfully to support flexible representation, which can support improved understanding and data analysis in similar settings. this paper presents the operational description of a city kg, an indicator ontology to support indicator discovery, and a data visualization application capable of performing metadata analysis to automatically build and display dashboards according to the discovered indicators. we describe our implementation in an urban mobility setting.",4 "a neuro-fuzzy technique for implementing a half-adder circuit using the canfis model. a neural network, in general, is considered a good solver of mathematical binary arithmetic problems. however, such networks are not easily developed for problems like the xor circuit. this paper presents a technique for the implementation of a half-adder circuit using the coactive neuro-fuzzy inference system (canfis) model and attempts to solve the problem using the neurosolutions 5 simulator. the paper gives experimental results along with interpretations and possible applications of the technique.",4 "accelerated optimization in the pde framework: formulations for the active contour case. following the seminal work of nesterov, accelerated optimization methods have been used to powerfully boost the performance of first-order, gradient-based parameter estimation in scenarios where second-order optimization strategies are either inapplicable or impractical. not only does accelerated gradient descent converge considerably faster than traditional gradient descent, but it also performs a more robust local search of the parameter space by initially overshooting and then oscillating back as it settles into a final configuration, thereby selecting only local minimizers with a basin of attraction large enough to contain the initial overshoot. this behavior has made accelerated and stochastic gradient search methods particularly popular within the machine learning community.
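the accelerated behavior described above can be seen on a toy quadratic: a minimal sketch comparing plain gradient descent with nesterov's accelerated variant (here the constant-momentum form for strongly convex problems; the matrix, step size, and momentum are illustrative choices, not from the paper).

```python
import numpy as np

# ill-conditioned quadratic f(x) = 0.5 * x^T A x, minimized at the origin
A = np.diag([1.0, 100.0])
grad = lambda x: A @ x

def gd(x0, lr=0.01, steps=300):
    # traditional gradient descent
    x = x0.copy()
    for _ in range(steps):
        x -= lr * grad(x)
    return x

def nesterov(x0, lr=0.01, mu=9 / 11, steps=300):
    # nesterov acceleration: gradient evaluated at a momentum look-ahead point
    # mu = (sqrt(kappa)-1)/(sqrt(kappa)+1) with condition number kappa = 100
    x_prev, x = x0.copy(), x0.copy()
    for _ in range(steps):
        y = x + mu * (x - x_prev)       # overshooting look-ahead
        x_prev, x = x, y - lr * grad(y)
    return x

x0 = np.array([1.0, 1.0])
err_gd = np.linalg.norm(gd(x0))
err_nag = np.linalg.norm(nesterov(x0))
```

on this example the accelerated iterate oscillates around the minimizer early on but ends up orders of magnitude closer after the same number of steps.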
in their recent pnas 2016 paper, wibisono, wilson, and jordan demonstrate how a broad class of accelerated schemes can be cast in a variational framework formulated around the bregman divergence, leading to continuum limit ode's. we show how their formulation may be extended to infinite dimensional manifolds (starting here with the geometric space of curves and surfaces) by substituting the bregman divergence with inner products on the tangent space and explicitly introducing a distributed mass model which evolves in conjunction with the object of interest during the optimization process. the co-evolving mass model, introduced purely for the sake of endowing the optimization with helpful dynamics, also links the resulting class of accelerated pde based optimization schemes to fluid dynamical formulations of optimal mass transport.",4 "outlier-robust moment-estimation via sum-of-squares. we develop efficient algorithms for estimating low-degree moments of unknown distributions in the presence of adversarial outliers. the guarantees of our algorithms improve in many cases significantly over the best previous ones, obtained in the recent works of diakonikolas et al, lai et al, and charikar et al. we also show that the guarantees of our algorithms match information-theoretic lower-bounds for the class of distributions we consider. these improved guarantees allow us to give improved algorithms for independent component analysis and learning mixtures of gaussians in the presence of outliers. our algorithms are based on a standard sum-of-squares relaxation of the following conceptually-simple optimization problem: among all distributions whose moments are bounded in the same way as for the unknown distribution, find the one that is closest in statistical distance to the empirical distribution of the adversarially-corrupted sample.",4 "classification with data contamination with application to remote sensing image mis-registration. this work is motivated by the problem of image mis-registration in remote sensing, where we are interested in determining the resulting loss in the accuracy of pattern classification. a statistical formulation is given, where we propose to use a data contamination model to understand the phenomenon of image mis-registration. this model is widely applicable to many other types of errors as well, for example, measurement errors and gross errors etc. the impact of data contamination on classification is studied in a statistical learning theoretical framework.
a closed-form asymptotic bound is established for the resulting loss in classification accuracy, which is less than $\epsilon/(1-\epsilon)$ for a data contamination amount of $\epsilon$. our bound is sharper than similar bounds in the domain adaptation literature and, unlike such bounds, it applies to classifiers with an infinite vapnik-chervonenkis (vc) dimension. extensive simulations are conducted on both synthetic and real datasets under various types of data contamination, including label flipping, feature swapping and the replacement of feature values with data generated from a random source such as a gaussian or cauchy distribution. the simulation results show that the bound we derive is fairly tight.",19 "optimal change point detection in gaussian processes. we study the problem of detecting a change in the mean of one-dimensional gaussian process data. this problem is investigated in the setting of increasing domain (customarily employed in time series analysis) and in the setting of fixed domain (typically arising in spatial data analysis). we propose a detection method based on the generalized likelihood ratio test (glrt), and show that our method achieves a nearly asymptotically optimal rate in the minimax sense, in both settings. the salient feature of the proposed method is that it exploits in an efficient way the data dependence captured by the gaussian process covariance structure. when the covariance is not known, we propose a plug-in glrt method and derive conditions under which the method remains asymptotically near optimal. by contrast, the standard cusum method, which does not account for the covariance structure, is shown to be asymptotically optimal only in the increasing domain. our algorithms and accompanying theory are applicable to a wide variety of covariance structures, including the matern class, the powered exponential class, and others. the plug-in glrt method is shown to perform well with maximum likelihood estimators of a dense covariance matrix.",12 "generalizing consistency and other constraint properties to quantified constraints. quantified constraints and quantified boolean formulae are typically much more difficult to reason with than classical constraints, because quantifier alternation makes the usual notion of solution inappropriate. as a consequence, even basic properties of constraint satisfaction problems (csp), such as consistency or substitutability, are not completely understood in the quantified case.
these properties are important because they are the basis of most of the reasoning methods used to solve classical (existentially quantified) constraints, and one would like to benefit from similar reasoning methods in the resolution of quantified constraints. in this paper, we show that most of the properties that are used by solvers for csp can be generalized to quantified csp. this requires a re-thinking of a number of basic concepts; in particular, we propose a notion of outcome that generalizes the classical notion of solution, and on which all the definitions are based. we also propose a systematic study of the relations which hold between these properties, as well as complexity results regarding the decision of these properties. finally, since these problems are typically intractable, we generalize the approach used in csp and propose weaker, easier to check notions based on locality, which allow these properties to be detected incompletely but in polynomial time.",4 "more general queries and less generalization error in adaptive data analysis. adaptivity is an important feature of data analysis---typically the choice of questions asked about a dataset depends on previous interactions with the same dataset. however, generalization error is typically bounded in a non-adaptive model, where all questions are specified before the dataset is drawn. recent work by dwork et al. (stoc '15) and hardt and ullman (focs '14) initiated the formal study of this problem, and gave the first upper and lower bounds on the achievable generalization error for adaptive data analysis. specifically, suppose there is an unknown distribution $\mathcal{p}$ and a set of $n$ independent samples $x$ is drawn from $\mathcal{p}$. we seek an algorithm that, given $x$ as input, ""accurately"" answers a sequence of adaptively chosen ""queries"" about the unknown distribution $\mathcal{p}$. how many samples $n$ must we draw from the distribution, as a function of the type of queries, the number of queries, and the desired level of accuracy? in this work we make two new contributions towards resolving this question: *we give upper bounds on the number of samples $n$ needed to answer statistical queries that improve over the bounds of dwork et al. *we prove the first upper bounds on the number of samples required to answer more general families of queries. these include arbitrary low-sensitivity queries and the important class of convex risk minimization queries.
as in dwork et al., our algorithms are based on a connection between differential privacy and generalization error, but we feel our analysis is simpler and more modular, which may be useful in studying these questions in the future.",4 "a clever elimination strategy for efficient minimal solvers. we present a new insight into the systematic generation of minimal solvers in computer vision, which leads to smaller and faster solvers. many minimal problem formulations are coupled sets of linear and polynomial equations where image measurements enter the linear equations only. we show that it is useful to solve such systems by first eliminating all the unknowns that do not appear in the linear equations and then extending the solutions to the rest of the unknowns. this can be generalized to fully non-linear systems by linearization via lifting. we demonstrate that this approach leads to more efficient solvers in three problems of partially calibrated relative camera pose computation with unknown focal length and/or radial distortion. our approach also generates new interesting constraints on the fundamental matrices of partially calibrated cameras, which were not known before.",4 "are you imitating me? unsupervised sparse modeling for group activity analysis from a single video. a framework for unsupervised group activity analysis from a single video is presented. our working hypothesis is that human actions lie on a union of low-dimensional subspaces, and thus can be efficiently modeled as sparse linear combinations of atoms from a learned dictionary representing the action's primitives. contrary to prior art, and with the primary goal of spatio-temporal action grouping, in this work only one single video segment is available for unsupervised learning and analysis, without any prior training information. after extracting simple features at a single spatio-temporal scale, we learn a dictionary for each individual over a short time lapse. these dictionaries allow us to compare the individuals' actions by producing an affinity matrix which contains sufficient discriminative information about the actions in the scene, leading to grouping with simple and efficient tools. with diverse publicly available real videos, we demonstrate the effectiveness of the proposed framework and its robustness to cluttered backgrounds, changes of human appearance, and action variability.",4 "discrimination discovery and removal in ranked data using a causal graph.
predictive models learned from historical data are widely used to help companies and organizations make decisions. however, they may digitally unfairly treat unwanted groups, raising concerns about fairness and discrimination. in this paper, we study the fairness-aware ranking problem, which aims to discover discrimination in ranked datasets and reconstruct a fair ranking. existing methods in fairness-aware ranking are mainly based on statistical parity and cannot measure the true discriminatory effect, since discrimination is causal. on the other hand, existing methods in causal-based anti-discrimination learning focus on classification problems and cannot be directly applied to handle ranked data. to address these limitations, we propose to map the rank position to a continuous score variable that represents the qualification of the candidates. then, we build a causal graph that consists of both the discrete profile attributes and the continuous score. the path-specific effect technique is extended to this mixed-variable causal graph to identify both direct and indirect discrimination. the relationship between the path-specific effects for the ranked data and those for the binary decision is theoretically analyzed. finally, algorithms for discovering and removing discrimination from a ranked dataset are developed. experiments using a real dataset show the effectiveness of our approaches.",4 "solving linear equations using a jacobi based time-variant adaptive hybrid evolutionary algorithm. for a large set of linear equations, especially for sparse and structured coefficient (matrix) equations, solutions using classical methods become arduous. evolutionary algorithms have mostly been used to solve various optimization and learning problems. recently, the hybridization of classical methods (the jacobi method and the gauss-seidel method) with evolutionary computation techniques has been successfully applied to linear equation solving. in these hybrid evolutionary methods, uniform adaptation (ua) techniques are used to adapt the relaxation factor. in this paper, a new jacobi based time-variant adaptive (jbtva) hybrid evolutionary algorithm is proposed. in this algorithm, a time-variant adaptive (tva) technique for the relaxation factor is introduced, aiming at improving the fine local tuning and reducing the disadvantage of uniform adaptation of relaxation factors.
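the relaxed jacobi (sr) sweep that these hybrid methods build on can be sketched as below; the decaying relaxation-factor schedule and the test system are illustrative stand-ins, not the paper's tva rule.

```python
import numpy as np

# weighted-jacobi (successive relaxation) sweeps for A x = b,
# with a relaxation factor that varies over the iterations.
def jacobi_sr(A, b, omega_schedule, iters=200):
    D = np.diag(A)                 # diagonal part
    R = A - np.diag(D)             # off-diagonal remainder
    x = np.zeros_like(b)
    for k in range(iters):
        omega = omega_schedule(k)
        x_jac = (b - R @ x) / D                  # one plain jacobi sweep
        x = (1.0 - omega) * x + omega * x_jac    # relaxed update
    return x

# a diagonally dominant system, so the sweep is guaranteed to converge
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 5.0, 2.0],
              [0.0, 2.0, 6.0]])
b = np.array([1.0, 2.0, 3.0])
# hypothetical time-variant schedule: starts at 1 and decays slowly
x = jacobi_sr(A, b, lambda k: 1.0 / (1.0 + 0.01 * k))
residual = np.linalg.norm(A @ x - b)
```

the hybrid evolutionary methods adapt the schedule omega(k) itself rather than fixing it in advance.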
this algorithm integrates the jacobi based sr method with a time variant adaptive evolutionary algorithm. the convergence theorems of the proposed algorithm are proved theoretically, and the performance of the proposed algorithm is compared with the jbua hybrid evolutionary algorithm and classical methods in the experimental domain. the proposed algorithm outperforms the jbua hybrid algorithm and classical methods in terms of convergence speed and effectiveness.",4 "large-scale image retrieval with attentive deep local features. we propose an attentive local feature descriptor suitable for large-scale image retrieval, referred to as delf (deep local feature). the new feature is based on convolutional neural networks, trained only with image-level annotations on a landmark image dataset. to identify semantically useful local features for image retrieval, we also propose an attention mechanism for keypoint selection, which shares most network layers with the descriptor. this framework can be used for image retrieval as a drop-in replacement for other keypoint detectors and descriptors, enabling more accurate feature matching and geometric verification. our system produces reliable confidence scores to reject false positives---in particular, it is robust against queries that have no correct match in the database. to evaluate the proposed descriptor, we introduce a new large-scale dataset, referred to as the google-landmarks dataset, which involves challenges in both the database and the query such as background clutter, partial occlusion, multiple landmarks, objects in variable scales, etc. we show that delf outperforms the state-of-the-art global and local descriptors in the large-scale setting by significant margins. code and dataset can be found at the project webpage: https://github.com/tensorflow/models/tree/master/research/delf .",4 "towards a better global polynomial approximation for image rectification. when using images to locate objects, there is the problem of correcting for distortion and misalignment in the images. an elegant way of solving this problem is to generate an error correcting function that maps points in an image to their corrected locations. we generate such a function by fitting a polynomial to a set of sample points. the objective is to identify a polynomial that passes ""sufficiently close"" to these points with a ""good"" approximation of intermediate points. in the past, it has been difficult to achieve a good global polynomial approximation using only sample points.
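the basic building block of such a rectification scheme -- a least-squares polynomial fit of the correction map from sample points -- can be sketched as follows; the synthetic quadratic warp used to generate the sample points is invented for illustration.

```python
import numpy as np

# fit a degree-2 polynomial map (x, y) -> x' by least squares,
# the error-correcting function described in the text.
def design_matrix(x, y):
    # monomials up to degree 2: 1, x, y, x^2, x*y, y^2
    return np.column_stack([np.ones_like(x), x, y, x * x, x * y, y * y])

rng = np.random.default_rng(1)
x, y = rng.uniform(-1, 1, 50), rng.uniform(-1, 1, 50)
# synthetic distortion: a mild quadratic warp of the sample points
xc = x + 0.1 * x * x - 0.05 * x * y

M = design_matrix(x, y)
coef_x, *_ = np.linalg.lstsq(M, xc, rcond=None)

# evaluate the fitted map at an intermediate point (0.3, 0.4)
pred_x = design_matrix(np.array([0.3]), np.array([0.4])) @ coef_x
```

because the synthetic warp is itself quadratic, the fit reproduces it exactly at the samples; with real distortion the polynomial only approximates intermediate points, which is the difficulty the paper targets.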
we report on the development of a global polynomial approximation algorithm for solving this problem. key words: polynomial approximation, interpolation, image rectification.",4 "answering complicated question intents expressed in decomposed question sequences. recent work in semantic parsing for question answering has focused on long and complicated questions, many of which would seem unnatural if asked in a normal conversation between two humans. in an effort to explore a conversational qa setting, we present a more realistic task: answering sequences of simple but inter-related questions. we collect a dataset of 6,066 question sequences that inquire about semi-structured tables from wikipedia, with 17,553 question-answer pairs in total. existing qa systems face two major problems when evaluated on our dataset: (1) handling questions that contain coreferences to previous questions or answers, and (2) matching words or phrases in a question to corresponding entries in the associated table. we conclude by proposing strategies to handle both of these issues.",4 "risk aversion as an evolutionary adaptation. risk aversion is a common behavior universal to humans and animals alike. economists have traditionally defined risk preferences by the curvature of the utility function. psychologists and behavioral economists also make use of concepts such as loss aversion and probability weighting to model risk aversion. neurophysiological evidence suggests that loss aversion has its origins in relatively ancient neural circuitries (e.g., the ventral striatum). could there thus be an evolutionary origin to risk avoidance? we study this question by evolving strategies that adapt to play the equivalent mean payoff gamble. we hypothesize that risk aversion in the equivalent mean payoff gamble is beneficial as an adaptation to living in small groups, and find that a preference for risk averse strategies only evolves in small populations of less than 1,000 individuals, while agents exhibit no strategy preference in larger populations. further, we discover that risk aversion can also evolve in larger populations, but only when the populations are segmented into small groups of around 150 individuals. finally, we observe that risk aversion only evolves when the gamble is a rare event that has a large impact on the individual's fitness. our findings align with earlier reports that humans lived in small groups for a large portion of their evolutionary history.
as such, we suggest that rare, high-risk, high-payoff events such as mating and mate competition could have driven the evolution of risk averse behavior in humans living in small groups.",16 "do deep learning models have too many parameters? an information theory viewpoint. deep learning models often have more parameters than observations, and still perform well. this is sometimes described as a paradox. in this work, we show experimentally that despite their huge number of parameters, deep neural networks can compress the data losslessly even when taking the cost of encoding the parameters into account. such a compression viewpoint originally motivated the use of variational methods in neural networks. however, we show that these variational methods provide surprisingly poor compression bounds, despite being explicitly built to minimize such bounds. this might explain the relatively poor practical performance of variational methods in deep learning. better encoding methods, imported from the minimum description length (mdl) toolbox, yield much better compression values on deep networks, corroborating the hypothesis that good compression on the training set correlates with good test performance.",4 "global sensitivity analysis with dependence measures. global sensitivity analysis with variance-based measures suffers from several theoretical and practical limitations, since it focuses only on the variance of the output and handles multivariate variables in a limited way. in this paper, we introduce a new class of sensitivity indices based on dependence measures which overcomes these insufficiencies. our approach originates from the idea of comparing the output distribution with its conditional counterpart when one of the input variables is fixed. we establish that this comparison yields previously proposed indices when it is performed with csiszar f-divergences, as well as sensitivity indices which are well-known dependence measures between random variables. this leads us to investigate completely new sensitivity indices based on recent state-of-the-art dependence measures, such as distance correlation and the hilbert-schmidt independence criterion. we also emphasize the potential of feature selection techniques relying on such dependence measures as alternatives to screening in high dimension.",12 "co-segmentation for space-time co-located collections. we present a co-segmentation technique for space-time co-located image collections.
these prevalent collections capture various dynamic events, usually by multiple photographers, and may contain multiple co-occurring objects which are not necessarily part of the intended foreground object, resulting in ambiguities for traditional co-segmentation techniques. thus, to disambiguate what the common foreground object is, we introduce a weakly-supervised technique, where we assume only a small seed, given in the form of a single segmented image. we take a distributed approach, where local belief models are propagated and reinforced with similar images. our technique progressively expands the foreground and background belief models across the entire collection. the technique exploits the power of the entire set of images without building a global model, and thus successfully overcomes the large variability in appearance of the common foreground object. we demonstrate that our method outperforms previous co-segmentation techniques on challenging space-time co-located collections, including dense benchmark datasets which were adapted for our novel problem setting.",4 "neural attention models for sequence classification: analysis and application to key term extraction and dialogue act detection. recurrent neural network architectures combined with an attention mechanism, or neural attention models, have shown promising performance recently for tasks including speech recognition, image caption generation, visual question answering and machine translation. in this paper, a neural attention model is applied to two sequence classification tasks, dialogue act detection and key term extraction. in these sequence labeling tasks, the model input is a sequence, and the output is the label of the input sequence. the major difficulty of sequence labeling is that when the input sequence is long, it can include many noisy or irrelevant parts. if the information in the whole sequence is treated equally, the noisy or irrelevant parts may degrade the classification performance. the attention mechanism is helpful for the sequence classification task because it is capable of highlighting the important parts among the entire sequence for the classification task. the experimental results show that with the attention mechanism, discernible improvements were achieved in the sequence labeling tasks considered here. the roles of the attention mechanism in these tasks are analyzed and visualized in this paper.",4 "data augmentation via levy processes.
if a document is about travel, we may expect that short snippets of the document should also be about travel. we introduce a general framework for incorporating these types of invariances into a discriminative classifier. the framework imagines data as being drawn from a slice of a levy process. if we slice the levy process at an earlier point in time, we obtain additional pseudo-examples, which can be used to train the classifier. we show that this scheme has two desirable properties: it preserves the bayes decision boundary, and it is equivalent to fitting a generative model in the limit where we rewind time back to 0. our construction captures popular schemes such as gaussian feature noising and dropout training, as well as admitting new generalizations.",19 "orthogonal rank-one matrix pursuit for low rank matrix completion. in this paper, we propose an efficient and scalable low rank matrix completion algorithm. the key idea is to extend the orthogonal matching pursuit method from the vector case to the matrix case. we further propose an economic version of our algorithm by introducing a novel weight updating rule to reduce the time and storage complexity. both versions are computationally inexpensive for each matrix pursuit iteration, and find satisfactory results in a few iterations. another advantage of the proposed algorithm is that it has only one tunable parameter, which is the rank. it is easy to understand and to use by the user. this becomes especially important in large-scale learning problems. in addition, we rigorously show that both versions achieve a linear convergence rate, which is significantly better than the previous known results. we also empirically compare the proposed algorithms with several state-of-the-art matrix completion algorithms on many real-world datasets, including the large-scale recommendation dataset netflix as well as the movielens datasets. numerical results show that our proposed algorithm is more efficient than competing algorithms while achieving similar or better prediction performance.",4 "using incomplete information for complete weight annotation of road networks -- extended version. we are witnessing increasing interest in the effective use of road networks. for example, to enable effective vehicle routing, weighted-graph models of transportation networks are used, where the weight of an edge captures some cost associated with traversing the edge, e.g., greenhouse gas (ghg) emissions or travel time. it is a precondition to using a graph model for routing that all edges have weights.
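the weighted-graph routing model referred to above can be illustrated with a standard shortest-path query; the toy network and edge costs below are invented, with weights standing in for a travel cost such as travel time or ghg emissions.

```python
import heapq

# dijkstra's algorithm over an adjacency-list road graph whose edge
# weights encode a traversal cost.
def dijkstra(graph, source):
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

road = {
    "a": [("b", 4.0), ("c", 2.0)],
    "b": [("d", 5.0)],
    "c": [("b", 1.0), ("d", 8.0)],
    "d": [],
}
dist = dijkstra(road, "a")
```

routing like this only works once every edge carries a weight, which is exactly the precondition the paper's weight-annotation method is designed to satisfy.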
the weights that capture travel times and ghg emissions can be extracted from gps trajectory data collected in the network. however, gps trajectory data typically lacks the coverage needed to assign weights to all edges. this paper formulates and addresses the problem of annotating all edges in a road network with travel cost based weights from a set of trips in the network that cover only a small fraction of the edges, each with an associated ground-truth travel cost. a general framework is proposed to solve the problem. specifically, the problem is modeled as a regression problem and solved by minimizing a judiciously designed objective function that takes into account the topology of the road network. in particular, the use of weighted pagerank values of edges is explored for assigning appropriate weights to all edges, and the property of directional adjacency of edges is also taken into account to assign weights. empirical studies with weights capturing travel time and ghg emissions on two road networks (skagen, denmark, and north jutland, denmark) offer insight into the design properties of the proposed techniques and offer evidence that the techniques are effective.",4 "modeling events in machines. the notion of events has occupied a central role in modeling and has an influence in computer science and philosophy. recent developments in diagrammatic modeling have made it possible to examine the conceptual representation of events. this paper explores some aspects of the notion of events produced by applying a new diagrammatic methodology, with a focus on the interaction of events with such concepts as time, space, and objects. the proposed description applies to abstract machines where events form the dynamic phases of a system. the results of this nontechnical research can be utilized in the many fields where the notion of an event is typically used in interdisciplinary application.",4 "visual dynamics: probabilistic future frame synthesis via cross convolutional networks. we study the problem of synthesizing a number of likely future frames from a single input image. in contrast to traditional methods, which have tackled this problem in a deterministic or non-parametric way, we propose a novel approach that models future frames in a probabilistic manner. our probabilistic model makes it possible for us to sample and synthesize many possible future frames from a single input image. future frame synthesis is challenging, as it involves low- and high-level image and motion understanding.
we propose a novel network structure, namely a cross convolutional network, to aid in synthesizing future frames; this network structure encodes image and motion information as feature maps and convolutional kernels, respectively. in experiments, our model performs well on synthetic data, such as 2d shapes and animated game sprites, as well as on real-world videos. we also show that our model can be applied to tasks such as visual analogy-making, and present an analysis of the learned network representations.",4 "new perspectives on k-support and cluster norms. the $k$-support norm is a regularizer which has been successfully applied to sparse vector prediction problems. we show that it belongs to a general class of norms which can be formulated as a parameterized infimum over quadratics. we extend the $k$-support norm to matrices, and observe that it is a special case of the matrix cluster norm. using this formulation we derive an efficient algorithm to compute the proximity operator of both norms. this improves upon the standard algorithm for the $k$-support norm and allows us to apply proximal gradient methods to the cluster norm. we also describe how to solve regularization problems which employ centered versions of these norms. finally, we apply the matrix regularizers to different matrix completion and multitask learning datasets. our results indicate that the spectral $k$-support norm and the cluster norm give state of the art performance on these problems, significantly outperforming the trace norm and elastic net penalties.",19 "development of an n-type gm-phd filter for multiple target, multiple type visual tracking. we propose a new framework that extends the standard probability hypothesis density (phd) filter to multiple targets of $n$ different types, where $n\geq2$, based on random finite set (rfs) theory, taking into account not only background false positives (clutter), but also confusions among detections of different target types, which are in general different in character from background clutter. under the assumptions of gaussianity and linearity, our framework extends the existing gaussian mixture (gm) implementation of the standard phd filter to create an n-type gm-phd filter. the methodology is applied to real video sequences by integrating object detectors' information into the filter for two scenarios.
in the first scenario, a tri-gm-phd filter ($n=3$) is applied to real video sequences containing three types of multiple targets in the scene, two football teams and a referee, using separate but confused detections. in the second scenario, we use a dual gm-phd filter ($n=2$) for tracking pedestrians and vehicles in the scene, handling the detectors' confusions. in both cases, munkres's variant of the hungarian assignment algorithm is used to associate tracked target identities between frames. the approach is evaluated and compared to both raw detection and independent gm-phd filters using the optimal sub-pattern assignment (ospa) metric and the discrimination rate. this shows the improved performance of our strategy on real video sequences.",4 "efficient reinforcement learning using recursive least-squares methods. the recursive least-squares (rls) algorithm is one of the most well-known algorithms used in adaptive filtering, system identification and adaptive control. its popularity is mainly due to its fast convergence speed, which is considered to be optimal in practice. in this paper, rls methods are used to solve reinforcement learning problems, where two new reinforcement learning algorithms using linear value function approximators are proposed and analyzed. the two algorithms are called rls-td(lambda) and fast-ahc (fast adaptive heuristic critic), respectively. rls-td(lambda) can be viewed as an extension of rls-td(0) from lambda=0 to general lambda within the interval [0,1], so it is a multi-step temporal-difference (td) learning algorithm using rls methods. the convergence with probability one and the limit of convergence of rls-td(lambda) are proved for ergodic markov chains. compared to the existing ls-td(lambda) algorithm, rls-td(lambda) has advantages in computation and is more suitable for online learning. the effectiveness of rls-td(lambda) is analyzed and verified by learning prediction experiments on markov chains with a wide range of parameter settings. the fast-ahc algorithm is derived by applying the proposed rls-td(lambda) algorithm to the critic network of the adaptive heuristic critic method. unlike the conventional ahc algorithm, fast-ahc makes use of rls methods to improve the learning-prediction efficiency of the critic. learning control experiments on the cart-pole balancing and acrobot swing-up problems are conducted to compare the data efficiency of fast-ahc with that of the conventional ahc.
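the core recursion behind rls-td can be sketched in a few lines. this is a minimal rls-td(0) illustration on a toy two-state chain with tabular features, not the paper's implementation (the forgetting factor is fixed to 1 and the initial variance matrix is a scaled identity, both assumptions for the sketch):

```python
import numpy as np

def rls_td0_update(theta, P, x, x_next, r, gamma=1.0):
    """one rls-td(0) step: a sherman-morrison style recursive least-squares
    solve of the td fixed-point equations with linear values v(s) = theta.x."""
    d = x - gamma * x_next          # temporal-difference feature vector
    Px = P @ x
    k = Px / (1.0 + d @ Px)         # gain vector
    theta = theta + k * (r - d @ theta)
    P = P - np.outer(k, d @ P)      # update the inverse correlation matrix
    return theta, P

# toy chain: s0 -> s1 with reward 1, then s1 -> terminal with reward 0,
# so with gamma = 1 the true values are v(s0) = 1, v(s1) = 0
theta = np.zeros(2)
P = 100.0 * np.eye(2)               # large initial variance matrix
for _ in range(200):
    theta, P = rls_td0_update(theta, P, np.array([1.0, 0.0]),
                              np.array([0.0, 1.0]), r=1.0)
    theta, P = rls_td0_update(theta, P, np.array([0.0, 1.0]),
                              np.array([0.0, 0.0]), r=0.0)
print(theta)  # converges towards [1, 0]
```

as the abstract notes, the choice of the initial variance matrix (here 100*I) matters for the transient behaviour of such rls recursions.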
from the experimental results, it is shown that the data efficiency of learning control can also be improved by using rls methods in the learning-prediction process of the critic. the performance of fast-ahc is also compared with that of the ahc method using ls-td(lambda). furthermore, it is demonstrated in the experiments that different initial values of the variance matrix in rls-td(lambda) are required to get better performance not only in learning prediction but also in learning control. the experimental results are analyzed based on the existing theoretical work on the transient phase of forgetting factor rls methods.",4 "geometric cross-modal comparison of heterogeneous sensor data. in this work, we address the problem of cross-modal comparison of aerial data streams. a variety of simulated automobile trajectories are sensed using two different modalities: full-motion video, and radio-frequency (rf) signals received by detectors at various locations. the information represented by the two modalities is compared using self-similarity matrices (ssms) corresponding to time-ordered point clouds in the feature spaces of each of the data sources; note that the feature spaces are of entirely different scale and dimensionality. several metrics for comparing ssms are explored, including a cutting-edge time-warping technique that can simultaneously handle local time warping and partial matches, while also controlling for the change in geometry between the feature spaces of the two modalities. we note that this technique is quite general, and does not depend on the choice of modalities. in our particular setting, we demonstrate that the cross-modal distance between ssms corresponding to the same trajectory type is smaller than the cross-modal distance between ssms corresponding to distinct trajectory types, and we formalize this observation via precision-recall metrics in experiments. finally, we comment on the promising implications of these ideas for future integration into multiple-hypothesis tracking systems.",4 "local structure discovery in bayesian networks. learning a bayesian network structure from data is an np-hard problem and thus exact algorithms are feasible only for small data sets. therefore, the network structures for larger networks are usually learned with various heuristics. another approach to scaling up structure learning is local learning.
in local learning, the modeler has one or more target variables that are of special interest; he wants to learn the structure near the target variables and is not interested in the rest of the variables. in this paper, we present a score-based local learning algorithm called sll. we conjecture that our algorithm is theoretically sound in the sense that it is optimal in the limit of large sample size. empirical results suggest that sll is competitive when compared with the constraint-based hiton algorithm. we also study the prospects of constructing the network structure for the whole node set based on local results by presenting two algorithms and comparing them with several heuristics.",4 "digital synaptic neural substrate: a new approach to computational creativity. we introduce a new artificial intelligence (ai) approach called the 'digital synaptic neural substrate' (dsns). it uses selected attributes from objects in various domains (e.g. chess problems, classical music, renowned artworks) and recombines them in such a way as to generate new attributes that can then, in principle, be used to create novel objects of creative value to humans relating to one of the source domains. this allows some of the burden of creative content generation to be passed from humans to machines. the approach was tested in the domain of chess problem composition. we used it to automatically compose numerous sets of chess problems based on attributes extracted and recombined from chess problems and tournament games by humans, renowned paintings, computer-evolved abstract art, photographs of people, and classical music tracks. the quality of the generated chess problems was assessed automatically using an existing and experimentally-validated computational chess aesthetics model. they were also assessed by human experts in the domain. the results suggest that attributes collected and recombined from chess and non-chess domains using the dsns approach can indeed be used to automatically generate chess problems of reasonably high aesthetic quality. in particular, a low quality chess source (i.e. tournament game sequences between weak players) used in combination with actual photographs of people was able to produce three-move chess problems of comparable quality or better than those generated using a high quality chess source (i.e. published compositions by human experts), and just as efficiently as well. how information from a foreign domain can be integrated in a functional way remains an open question for now.
the dsns approach is, in principle, scalable and applicable to any domain in which objects have attributes that can be represented using real numbers.",4 "attenuation correction for brain pet imaging using a deep neural network based on dixon and zte mr images. positron emission tomography (pet) is a functional imaging modality widely used in neuroscience studies. to obtain meaningful quantitative results from pet images, attenuation correction is necessary during image reconstruction. for pet/mr hybrid systems, pet attenuation is challenging as magnetic resonance (mr) images do not reflect attenuation coefficients directly. to address this issue, we present deep neural network methods to derive the continuous attenuation coefficients for brain pet imaging from mr images. with only dixon mr images as the network input, the existing u-net structure was adopted, and analysis using forty patient data sets shows that it is superior to other dixon based methods. when both dixon and zero echo time (zte) images are available, apart from stacking multiple mr images along the u-net input channels, we propose a new network structure to extract the features from dixon and zte images independently at early layers and combine them together at later layers. quantitative analysis based on fourteen real patient data sets demonstrates that both network approaches can perform better than the standard methods, and the proposed network structure can further reduce the pet quantification error compared to the u-net structure with multiple inputs.",15 "sublabel-accurate discretization of nonconvex free-discontinuity problems. in this work we show that sublabel-accurate multilabeling approaches can be derived by approximating a classical label-continuous convex relaxation of nonconvex free-discontinuity problems. this insight allows us to extend sublabel-accurate approaches from total variation to general convex and nonconvex regularizations. furthermore, it leads to a systematic approach to the discretization of continuous convex relaxations. we study the relationship to existing discretizations of discrete-continuous mrfs. finally, we apply the proposed approach to obtain a sublabel-accurate and convex solution to the vectorial mumford-shah functional and show in several experiments that it leads to more precise solutions using fewer labels.",4 "document clustering based on topic maps.
the importance of document clustering is widely acknowledged by researchers for better management, smart navigation, efficient filtering, and concise summarization of large collections of documents like the world wide web (www). the next challenge lies in semantically performing the clustering based on the semantic contents of the documents. the problem of document clustering has two main components: (1) to represent the document in a form that inherently captures the semantics of the text; this may also help reduce the dimensionality of the document, and (2) to define a similarity measure based on the semantic representation that assigns higher numerical values to document pairs which have a higher semantic relationship. the feature space of documents is challenging for document clustering. a document may contain multiple topics, and it may contain a large set of class-independent general-words but only a handful of class-specific core-words. with these features in mind, traditional agglomerative clustering algorithms, which are based on either the document vector model (dvm) or the suffix tree model (stc), are less efficient in producing results with high cluster quality. this paper introduces a new approach for document clustering based on the topic map representation of the documents. each document is transformed into a compact form. a similarity measure is proposed based upon the information inferred from the topic maps data structures. the suggested method is implemented using agglomerative hierarchal clustering and tested on standard information retrieval (ir) datasets. the comparative experiment reveals that the proposed approach is effective in improving the cluster quality.",4 "persistent soft-clique in a set of sampled graphs. when searching for characteristic subpatterns in potentially noisy graph data, it appears self-evident that having multiple observations would be better than having just one. however, it turns out that the inconsistencies introduced when different graph instances have different edge sets pose a serious challenge. in this work we address this challenge for the problem of finding maximum weighted cliques. we introduce the concept of a persistent soft-clique. this is a subset of vertices, which 1) is almost fully or at least densely connected, 2) occurs in all or almost all graph instances, and 3) has the maximum weight. we present a measure of clique-ness, which essentially counts the number of edges missing to make a subset of vertices a clique.
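the missing-edge count underlying this clique-ness measure can be illustrated directly. a toy sketch on a hypothetical graph, not the paper's game-theoretic formulation:

```python
from itertools import combinations

def missing_edges(vertices, edges):
    """count the edges missing to make the given vertex subset a clique."""
    edge_set = {frozenset(e) for e in edges}
    return sum(1 for pair in combinations(sorted(vertices), 2)
               if frozenset(pair) not in edge_set)

# toy graph: a triangle {1, 2, 3} plus a pendant vertex 4 attached to 3
edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
print(missing_edges({1, 2, 3}, edges))     # already a clique -> 0
print(missing_edges({1, 2, 3, 4}, edges))  # pairs (1,4), (2,4) missing -> 2
```

a subset with a count of 0 is a clique; a small positive count corresponds to the "almost fully connected" requirement of a soft-clique.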
with this measure, we show that the problem of finding a persistent soft-clique can be cast either as: a) a max-min two person game optimization problem, or b) a min-min soft margin optimization problem. both formulations lead to the same solution when using a partial lagrangian method to solve the optimization problems. by experiments on synthetic data and real social network data, we show that the proposed method is able to reliably find soft cliques in graph data, even when distorted by random noise or unreliable observations.",4 "a general theory of image normalization. we give a systematic, abstract formulation of the image normalization method as applied to a general group of image transformations, and we illustrate the abstract analysis by applying it to the hierarchy of viewing transformations of a planar object.",4 "exploiting linear structure within convolutional networks for efficient evaluation. we present techniques for speeding up the test-time evaluation of large convolutional networks, designed for object recognition tasks. these models deliver impressive accuracy, but each image evaluation requires millions of floating point operations, making their deployment on smartphones and internet-scale clusters problematic. the computation is dominated by the convolution operations in the lower layers of the model. we exploit the linear structure present within the convolutional filters to derive approximations that significantly reduce the required computation. using large state-of-the-art models, we demonstrate speedups of convolutional layers on both cpu and gpu by a factor of 2x, while keeping the accuracy within 1% of the original model.",4 "modelling competitive sports: bradley-terry-élő models for supervised and on-line learning of paired competition outcomes. prediction and modelling of competitive sports outcomes has received much recent attention, especially from the bayesian statistics and machine learning communities. in the real world setting of outcome prediction, the seminal élő update still remains, after more than 50 years, a valuable baseline which is difficult to improve upon, though in its original form it is a heuristic and not a proper statistical ""model"". mathematically, the élő rating system is closely related to the bradley-terry models, which are usually used in an explanatory fashion rather than in a predictive supervised or on-line learning setting.
exploiting the close link between these two model classes and some newly observed similarities, we propose a new supervised learning framework with close similarities to logistic regression, low-rank matrix completion and neural networks. building on it, we formulate a class of structured log-odds models, unifying the desirable properties found in the above: supervised probabilistic prediction of scores and wins/draws/losses, batch/epoch and on-line learning, as well as the possibility to incorporate features in the prediction, without having to sacrifice the simplicity and parsimony of the bradley-terry models, or the computational efficiency of élő's original approach. we validate the structured log-odds modelling approach in synthetic experiments and on english premier league outcomes, where the added expressivity yields the best predictions reported in the state of the art, close in quality to contemporary betting odds.",19 "a theory of formal synthesis via inductive learning. formal synthesis is the process of generating a program satisfying a high-level formal specification. in recent times, effective formal synthesis methods have been proposed based on the use of inductive learning. we refer to this class of methods that learn programs from examples as formal inductive synthesis. in this paper, we present a theoretical framework for formal inductive synthesis. we discuss how formal inductive synthesis differs from traditional machine learning. we describe oracle-guided inductive synthesis (ogis), a framework that captures a family of synthesizers that operate by iteratively querying an oracle. an instance of ogis that has had much practical impact is counterexample-guided inductive synthesis (cegis). we present a theoretical characterization of cegis for learning any program that computes a recursive language. in particular, we analyze the relative power of cegis variants as the types of counterexamples generated by the oracle vary. we also consider the impact of bounded versus unbounded memory available to the learning algorithm. in the special case where the universe of candidate programs is finite, we relate the speed of convergence to the notion of teaching dimension studied in machine learning theory.
altogether, the results of the paper take a first step towards a theoretical foundation for the emerging field of formal inductive synthesis.",4 "deep-anomaly: fully convolutional neural network for fast anomaly detection in crowded scenes. the detection of abnormal behaviours in crowded scenes has to deal with many challenges. this paper presents an efficient method for the detection and localization of anomalies in videos. using fully convolutional neural networks (fcns) and temporal data, a pre-trained supervised fcn is transferred into an unsupervised fcn, ensuring the detection of (global) anomalies in scenes. high performance in terms of speed and accuracy is achieved by investigating the cascaded detection as a result of reducing computation complexities. the fcn-based architecture addresses two main tasks, feature representation and cascaded outlier detection. experimental results on two benchmarks suggest that the detection and localization of the proposed method outperforms existing methods in terms of accuracy.",4 "fontcode: embedding information in text documents using glyph perturbation. we introduce fontcode, an information embedding technique for text documents. provided a text document with specific fonts, our method embeds user-specified information in the text by perturbing the glyphs of text characters while preserving the text content. we devise an algorithm that chooses unobtrusive yet machine-recognizable glyph perturbations, leveraging a recently developed generative model that alters the glyphs of each character continuously on a font manifold. we then introduce an algorithm that embeds a user-provided message in the text document and produces an encoded document whose appearance is minimally perturbed from the original document. we also present a glyph recognition method that recovers the embedded information from an encoded document stored as a vector graphic or pixel image, or even printed on paper. in addition, we introduce a new error-correction coding scheme that rectifies a certain number of recognition errors. lastly, we demonstrate that our technique enables a wide array of applications, using it as a text document metadata holder, an unobtrusive optical barcode, a cryptographic message embedding scheme, and a text document signature.",4 "constraint-based sequence mining using constraint programming.
the goal of constraint-based sequence mining is to find sequences of symbols that are included in a large number of input sequences and that satisfy some constraints specified by the user. many constraints have been proposed in the literature, but a general framework is still missing. we investigate the use of constraint programming as a general framework for this task. we first identify four categories of constraints that are applicable to sequence mining. we then propose two constraint programming formulations. the first formulation introduces a new global constraint called exists-embedding. this formulation is the most efficient, but does not support one type of constraint. to support such constraints, we develop a second formulation that is more general but incurs more overhead. both formulations can use the projected database technique used in specialised algorithms. experiments demonstrate the flexibility towards constraint-based settings and compare the approach to existing methods.",4 "ais-maca-z: maca based clonal classifier for splicing site, protein coding and promoter region identification in eukaryotes. bioinformatics incorporates information regarding biological data storage, accessing mechanisms and the presentation of characteristics within this data. most problems in bioinformatics can be addressed efficiently by computer techniques. this paper aims at building a classifier based on multiple attractor cellular automata (maca) which uses a fuzzy logic version z to predict splicing sites, protein coding and promoter region identification in eukaryotes. it is strengthened with an artificial immune system technique (ais), the clonal algorithm, for choosing the rules of best fitness. the proposed classifier can handle dna sequences of lengths 54, 108, 162, 252, and 354. the classifier gives the exact boundaries of protein and promoter regions with an average accuracy of 90.6%. the classifier can predict splicing sites with 97% accuracy. the classifier was tested with 1,97,000 data components taken from fickett & toung, epdnew, and sequences from a renowned medical university.",4 "a semi-automatic method for efficient detection of stories on social media. twitter has become one of the main sources of news for many people. as real-world events and emergencies unfold, twitter is abuzz with hundreds of thousands of stories about the events. some of these stories are harmless, while others could potentially be life-saving or sources of malicious rumors.
thus, it is critically important to be able to efficiently track stories that spread on twitter during these events. in this paper, we present a novel semi-automatic tool that enables users to efficiently identify and track stories about real-world events on twitter. we ran a user study with 25 participants, demonstrating that compared to more conventional methods, our tool can increase the speed and the accuracy with which users can track stories about real-world events.",4 "learning an invariant hilbert space for domain adaptation. this paper introduces a learning scheme to construct a hilbert space (i.e., a vector space along with its inner product) to address both unsupervised and semi-supervised domain adaptation problems. this is achieved by learning projections from each domain to a latent space, along with the mahalanobis metric of the latent space, to simultaneously minimize a notion of domain variance while maximizing a measure of discriminatory power. in particular, we make use of riemannian optimization techniques to match the statistical properties (e.g., first and second order statistics) of the samples projected into the latent space from different domains. upon the availability of class labels, we further deem samples sharing the same label to form compact clusters while pulling away samples coming from different classes. we extensively evaluate and contrast our proposal against state-of-the-art methods for the task of visual domain adaptation using both handcrafted and deep-net features. our experiments show that even with a simple nearest neighbor classifier, the proposed method can outperform several state-of-the-art methods that benefit from more involved classification schemes.",4 "point linking network for object detection. object detection is a core problem in computer vision. with the development of deep convnets, the performance of object detectors has been dramatically improved. deep convnets based object detectors mainly focus on regressing the coordinates of a bounding box, e.g., faster-r-cnn, yolo and ssd. different from these methods, which consider a bounding box as a whole, we propose a novel object bounding box representation using points and links, implemented using deep convnets, termed the point linking network (pln).
specifically, we regress the corner/center points of a bounding box and their links using a fully convolutional network; we then map the corner points and their links back to multiple bounding boxes; finally, the object detection result is obtained by fusing the multiple bounding boxes. pln is naturally robust to object occlusion and flexible to object scale variation and aspect ratio variation. in experiments, pln with the inception-v2 model achieves state-of-the-art single-model and single-scale results on the pascal voc 2007, pascal voc 2012 and coco detection benchmarks without bells and whistles. the source code will be released.",4 "concept drift learning with alternating learners. data-driven predictive analytics are in use today across a number of industrial applications, but their integration is hindered by the requirement of similarity among model training and test data distributions. this paper addresses the need of learning from possibly nonstationary data streams, or under concept drift, a commonly seen phenomenon in practical applications. a simple dual-learner ensemble strategy, the alternating learners framework, is proposed. a long-memory model learns stable concepts from a long relevant time window, while a short-memory model learns transient concepts from a small recent window. the difference in prediction performance of the two models is monitored and induces an alternating policy to select, update and reset the two models. the method features an online updating mechanism to maintain ensemble accuracy, and a concept-dependent trigger to focus on relevant data. empirical studies of the method demonstrate that it is effective for tracking and prediction on streaming data that carry abrupt and/or gradual changes.",4 "nonapproximability results for partially observable markov decision processes. we show that for several variations of partially observable markov decision processes, polynomial-time algorithms for finding control policies are unlikely to or simply don't have guarantees of finding policies within a constant factor or a constant summand of optimal. here ""unlikely"" means ""unless complexity classes collapse,"" where the collapses considered are p=np, p=pspace, and p=exp.
therefore, unless these collapses are shown to hold, any control-policy designer must choose between such performance guarantees and efficient computation.",4 "n-body networks: a covariant hierarchical neural network architecture for learning atomic potentials. we describe n-body networks, a neural network architecture for learning the behavior and properties of complex many body physical systems. our specific application is to learn atomic potential energy surfaces for use in molecular dynamics simulations. our architecture is novel in that (a) it is based on a hierarchical decomposition of the many body system into subsystems, (b) the activations of the network correspond to the internal state of each subsystem, (c) the ""neurons"" in the network are constructed explicitly so as to guarantee that the activations are covariant to rotations, and (d) the neurons operate entirely in fourier space, with the nonlinearities realized by tensor products followed by clebsch-gordan decompositions. as part of the description of our network, we give a characterization of the way the weights of the network may interact with the activations so as to ensure that the covariance property is maintained.",4 "magnet and ""efficient defenses against adversarial attacks"" are not robust to adversarial examples. magnet and ""efficient defenses..."" were recently proposed as defenses to adversarial examples. we find that we can construct adversarial examples that defeat these defenses with only a slight increase in distortion.",4 "dual extrapolation for faster lasso solvers. convex sparsity-inducing regularizations are ubiquitous in high-dimensional machine learning, but their non-differentiability requires the use of iterative solvers. to accelerate such solvers, state-of-the-art approaches consist in reducing the size of the optimization problem at hand. in the context of regression, this can be achieved either by discarding irrelevant features (screening techniques) or by prioritizing features likely to be included in the support of the solution (working set techniques). duality comes into play at several steps in these techniques. here, we propose an extrapolation technique starting from a sequence of iterates in the dual that leads to the construction of an improved dual point. this enables a tighter control of optimality as used in stopping criteria, as well as better screening performance of gap safe rules.
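the duality-gap stopping criterion that this abstract refers to can be illustrated concretely. this is a minimal ista sketch for the lasso where the dual-feasible point is built by simply rescaling the residual (the standard baseline construction, not the paper's extrapolated dual point); the problem data are random and hypothetical:

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_ista_with_gap(X, y, lam, max_iter=5000, tol=1e-6):
    """ista for min_b 0.5*||y - X b||^2 + lam*||b||_1, stopped with the
    duality gap computed from a rescaled-residual dual-feasible point."""
    n, p = X.shape
    beta = np.zeros(p)
    L = np.linalg.norm(X, 2) ** 2  # lipschitz constant of the smooth part
    gap = np.inf
    for _ in range(max_iter):
        r = y - X @ beta
        beta = soft_threshold(beta + (X.T @ r) / L, lam / L)
        r = y - X @ beta
        # rescale the residual so the dual point satisfies ||X^T theta||_inf <= 1
        theta = r / max(lam, np.linalg.norm(X.T @ r, np.inf))
        primal = 0.5 * r @ r + lam * np.sum(np.abs(beta))
        dual = 0.5 * y @ y - 0.5 * lam ** 2 * np.sum((theta - y / lam) ** 2)
        gap = primal - dual  # weak duality: gap >= 0, and -> 0 at the optimum
        if gap < tol:
            break
    return beta, gap

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 5))
beta_true = np.array([1.0, 0.0, -2.0, 0.0, 0.0])
y = X @ beta_true
beta, gap = lasso_ista_with_gap(X, y, lam=0.1)
print(beta, gap)
```

the paper's contribution is a better dual point than the rescaled residual used here, which tightens exactly this gap and hence the gap safe screening rules.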
finally, we propose a working set strategy based on an aggressive use of gap safe rules and our new dual point construction, which improves the state-of-the-art time performance on lasso problems.",19 "a structured approach for predicting image enhancement parameters. social networking on mobile devices has become commonplace in everyday life. in addition, the photo capturing process has become trivial due to advances in mobile imaging. hence people capture a lot of photos everyday and want them to be visually-attractive. this has given rise to automated, one-touch enhancement tools. however, the inability of these tools to provide personalized and content-adaptive enhancement has paved the way for machine-learned methods to do the same. the existing typical machine-learned methods heuristically (e.g. via knn-search) predict the enhancement parameters for a new image by relating the image to a set of similar training images. these heuristic methods need constant interaction with the training images, which makes the parameter prediction sub-optimal and computationally expensive at test time, which is undesired. this paper presents a novel approach for predicting the enhancement parameters of a new image using only its features, without using the training images. we propose to model the interaction between the image features and their corresponding enhancement parameters using matrix factorization (mf) principles. we also propose a way to integrate the image features in the mf formulation. we show that our approach outperforms heuristic approaches as well as recent approaches in mf and structured prediction on synthetic as well as real-world data of image enhancement.",4 "building high-level features using large scale unsupervised learning. we consider the problem of building high-level, class-specific feature detectors from only unlabeled data. for example, is it possible to learn a face detector using only unlabeled images? to answer this, we train a 9-layered locally connected sparse autoencoder with pooling and local contrast normalization on a large dataset of images (the model has 1 billion connections, and the dataset has 10 million 200x200 pixel images downloaded from the internet). we train this network using model parallelism and asynchronous sgd on a cluster of 1,000 machines (16,000 cores) for three days.
contrary to what appears to be a widely-held intuition, our experimental results reveal that it is possible to train a face detector without having to label images as containing a face or not. control experiments show that this feature detector is robust not only to translation but also to scaling and out-of-plane rotation. we also find that the same network is sensitive to other high-level concepts such as cat faces and human bodies. starting from these learned features, we trained our network to obtain 15.8% accuracy in recognizing 20,000 object categories from imagenet, a leap of 70% relative improvement over the previous state-of-the-art.",4 "real-time deep registration with geodesic loss. with an aim to increase the capture range and accelerate the performance of state-of-the-art inter-subject and subject-to-template 3d registration, we propose deep learning-based methods that are trained to find the 3d position of arbitrarily oriented subjects or anatomy based on slices or volumes of medical images. for this, we propose regression cnns that learn to predict the angle-axis representation of 3d rotations and translations using image features. we use and compare mean square error and geodesic loss for training regression cnns in two different scenarios: 3d pose estimation from slices and 3d to 3d registration. as an exemplary application, we applied the proposed methods to register arbitrarily oriented reconstructed images of fetuses scanned in-utero at a wide gestational age range to a standard atlas space. our results show that in such registration applications that are amenable to learning, the proposed deep learning methods with geodesic loss minimization can achieve accurate results with a wide capture range in real-time (<100ms). we also tested the generalization capability of the trained cnns on an expanded age range and on images of newborn subjects with similar and different mr image contrasts. we trained our models on t2-weighted fetal brain mri scans and used them to predict the 3d position of newborn brains based on t1-weighted mri scans. we showed that the trained models generalized well for the new domain when we performed image contrast transfer through a conditional generative adversarial network. this indicates that the domain of application of the trained deep regression cnns can be expanded to image modalities and contrasts other than those used in training.
a combination of the proposed methods with optimization-based registration algorithms can dramatically enhance the performance of automatic imaging devices and the image processing methods of the future.",4 "investigating the effects of dynamic precision scaling on neural network training. training neural networks is a time- and compute-intensive operation. this is mainly due to the large amount of floating point tensor operations required during training. these constraints limit the scope of design space explorations (in terms of hyperparameter search) for data scientists and researchers. recent work has explored the possibility of reducing the numerical precision used to represent the parameters, activations, and gradients of neural network training as a way to reduce the computational cost of training (and thus reduce training time). in this paper we develop a novel dynamic precision scaling scheme and evaluate its performance, comparing it to previous works. using stochastic fixed-point rounding, a quantization-error based scaling scheme, and dynamic bit-widths during training, we achieve 98.8% test accuracy on the mnist dataset using an average bit-width of 16 bits for the weights and 14 bits for the activations. this beats the previous state-of-the-art dynamic bit-width precision scaling algorithm.",4 "understanding model counting for $β$-acyclic cnf-formulas. we extend the knowledge about so-called structural restrictions of $\mathrm{\#sat}$ by giving a polynomial time algorithm for $\beta$-acyclic $\mathrm{\#sat}$. in contrast to previous algorithms in the area, our algorithm does not proceed by dynamic programming but works along an elimination order, solving a weighted version of constraint satisfaction. moreover, we give evidence that this deviation from the standard algorithm design is not a coincidence: there is likely no dynamic programming algorithm of the usual style for $\beta$-acyclic $\mathrm{\#sat}$.",4 "master's thesis: deep learning for visual recognition. the goal of our research is to develop methods advancing automatic visual recognition. in order to predict the unique or multiple labels associated with an image, we study different kinds of deep neural network architectures and methods for supervised feature learning.
we first draw up a state-of-the-art review of convolutional neural networks, aiming to understand the history behind this family of statistical models, the limits of modern architectures, and the novel techniques currently used to train deep cnns. the originality of our work lies in our approach focusing on tasks with a low amount of data. we introduce different models and techniques to achieve the best accuracy on several kinds of datasets, such as a medium dataset of food recipes (100k images) for building a web api, and a small dataset of satellite images (6,000) for the dsg online challenge that we've won. we also draw up a state-of-the-art review of weakly supervised learning, introducing different kinds of cnns able to localize regions of interest. our last contribution is a framework, built on top of torch7, for training and testing deep models on any visual recognition tasks and on datasets of any scale.",4 "improved modified cholesky decomposition method for inverse covariance matrix estimation. the modified cholesky decomposition is commonly used for inverse covariance matrix estimation given a specified order of random variables. however, the order of variables is often not available or cannot be pre-determined. hence, we propose a novel estimator to address the variable order issue in the modified cholesky decomposition for estimating the sparse inverse covariance matrix. the key idea is to effectively combine a set of estimates obtained from multiple permutations of variable orders, and to efficiently encourage the sparse structure of the resultant estimate by the use of a thresholding technique on the combined cholesky factor matrix. the consistency of the proposed estimate is established under some weak regularity conditions. simulation studies show the superior performance of the proposed method in comparison with several existing approaches. we also apply the proposed method to linear discriminant analysis for analyzing real-data examples for classification.",19 "randomized smoothing for stochastic optimization. we analyze convergence rates of stochastic optimization procedures for non-smooth convex optimization problems. by combining randomized smoothing techniques with accelerated gradient methods, we obtain convergence rates of stochastic optimization procedures, both in expectation and with high probability, that have an optimal dependence on the variance of the gradient estimates.
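the randomized smoothing idea above can be sketched in one function: the non-smooth objective f is replaced by its gaussian-smoothed version f_mu(x) = E[f(x + mu*z)], whose gradient can be estimated by averaging subgradients of f queried at perturbed points. a minimal illustration on f(x) = |x| (my own toy example, not the paper's accelerated scheme):

```python
import numpy as np

def smoothed_grad(subgrad, x, mu, n_samples, rng):
    """monte-carlo estimate of the gradient of the gaussian-smoothed function
    f_mu(x) = E[f(x + mu * z)], obtained by averaging subgradients of f
    queried at randomly perturbed points x + mu * z."""
    z = rng.standard_normal((n_samples,) + np.shape(x))
    return np.mean([subgrad(x + mu * zi) for zi in z], axis=0)

# example: f(x) = |x| is non-smooth at 0; a subgradient is sign(x).
# for gaussian smoothing, the exact smoothed gradient at x is
# E[sign(x + mu*z)] = 2*Phi(x/mu) - 1, so at x = 0.2, mu = 0.1 it is
# 2*Phi(2) - 1 ~ 0.954
rng = np.random.default_rng(0)
g = smoothed_grad(np.sign, 0.2, mu=0.1, n_samples=20000, rng=rng)
print(g)
```

the smoothing parameter mu trades off approximation error against smoothness, which is exactly the knob the accelerated-gradient analysis in the abstract exploits.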
to the best of our knowledge, these are the first variance-based convergence rates for non-smooth optimization. we give several applications of our results to statistical estimation problems, and provide experimental results that demonstrate the effectiveness of the proposed algorithms. we also describe how a combination of our algorithm with recent work on decentralized optimization yields a distributed stochastic optimization algorithm that is order-optimal.",12 "code minimization for fringe projection based 3d stereo sensors by calibration improvement. code minimization speeds up the processing time of fringe projection based stereo sensors and may make them real-time applicable. this paper reports a methodology that enables such sensors to completely omit the gray code or any additional code: a sequence of sinusoidal images is all that is necessary. the code reduction is achieved by involving the projection unit in the measurement, by double triangulation, and by a precise projector calibration together with a significant projector calibration improvement, respectively.",12 "ethics in robotics. the three laws of robotics first appeared together in isaac asimov's story 'runaround', having been mentioned in some form in previous works by asimov. commonly known as the three laws of robotics, they are among the earliest depictions of the need for ethics in robotics. in simplistic language, isaac asimov was able to explain the rules a robot must follow in order to maintain societal sanctity. however, even though outdated, they still represent innate fears that are beginning to resurface in the present-day 21st century. society is at the advent of a new revolution; a revolution led by advances in computer science, artificial intelligence & nanotechnology. these advances have been phenomenal and have surpassed what moore's law predicted. with these advancements comes the fear that the future may be at the mercy of androids. humans today are scared that we, ourselves, might create something we cannot control. we may end up creating something that can learn much faster than any of us can, and also evolve faster than the theory of evolution has allowed us to. our greatest fear might be to lose our jobs to intelligent beings, beings that might end up replacing us at the top of the cycle. public hysteria is heightened by a number of cultural works that depict the annihilation of the human race by robots. right from frankenstein to i, robot, mass media has also depicted these issues.
this paper is an effort to understand the need for ethics in robotics, simply termed roboethics. this is achieved by a study of artificial beings and the thought put behind them. at the end of the paper, however, it is concluded that we do not need ethical robots as much as we need ethical roboticists.",4 "generative openmax for multi-class open set classification. we present a conceptually new and flexible method for multi-class open set classification. unlike previous methods, where unknown classes are inferred with respect to the feature or decision distance to known classes, our approach is able to provide explicit modelling and a decision score for unknown classes. the proposed method, called generative openmax (g-openmax), extends openmax by employing generative adversarial networks (gans) for novel category image synthesis. we validate the proposed method on two datasets of handwritten digits and characters, obtaining superior results over the previous deep learning based method openmax. moreover, g-openmax provides a way to visualize samples representing the unknown classes in open space. our simple and effective approach could serve as a new direction to tackle the challenging multi-class open set classification problem.",4 "a new approach to the solution of economic dispatch using particle swarm optimization with simulated annealing. a new approach to the solution of economic dispatch using particle swarm optimization is presented: production is allocated among the dedicated units so that the imposed constraints are fulfilled and the cost of meeting the power needs is reduced. recently, soft computing methods have received additional attention and have been used in a number of successful practical applications. here, an attempt is made to find the minimum cost using a particle swarm optimization algorithm with data for three generating units. in this work, the data take into account loss coefficients, max-min power limits, and the cost function. pso and simulated annealing are applied for several different energy requirements. the outputs are compared with a conventional method; pso seems to give an improved result with an enhanced convergence characteristic. the methods are executed in the matlab environment. the effectiveness and feasibility of the proposed method are demonstrated in a case study with three generating units.
the output gives promising results, signifying that the proposed method is competent at economically determining higher-quality solutions to economic dispatch problems.",4 "database transposition for constrained (closed) pattern mining. recently, different works have proposed a new way to mine patterns in databases of pathological size. for example, experiments in genome biology usually provide databases with thousands of attributes (genes) but only tens of objects (experiments). in this case, mining the ""transposed"" database runs through a smaller search space, and the galois connection allows us to infer closed patterns in the original database. we focus on constrained pattern mining for such unusual databases and give a theoretical framework for database and constraint transposition. we discuss the properties of constraint transposition and look into classical constraints. we then address the problem of generating the closed patterns of the original database satisfying a constraint, starting from those mined in the ""transposed"" database. finally, we show how to generate all the patterns satisfying the constraint from the closed ones.",4 "bayesian multitask learning with latent hierarchies. we learn multiple hypotheses for related tasks under a latent hierarchical relationship between tasks. we exploit the intuition that for domain adaptation we wish to share classifier structure, but for multitask learning we wish to share covariance structure. our hierarchical model is seen to subsume several previously proposed multitask learning models and performs well on three distinct real-world data sets.",4 "practical reasoning for expressive description logics. description logics (dls) are a family of knowledge representation formalisms mainly characterised by constructors to build complex concepts and roles from atomic ones. expressive role constructors are important in many applications, but can be computationally problematical. we present an algorithm that decides satisfiability of the dl alc extended with transitive and inverse roles and functional restrictions with respect to general concept inclusion axioms and role hierarchies; early experiments indicate that this algorithm is well-suited for implementation. additionally, we show that alc extended with transitive and inverse roles is still in pspace.
we investigate the limits of decidability for this family of dls, showing that relaxing the constraints placed on the kinds of roles used in number restrictions leads to the undecidability of all inference problems. finally, we describe a number of optimisation techniques that are crucial in obtaining implementations of the decision procedures, which, despite the worst-case complexity of the problem, exhibit good performance with real-life problems.",4 "cnndroid: gpu-accelerated execution of trained deep convolutional neural networks on android. many mobile applications running on smartphones and wearable devices would potentially benefit from the accuracy and scalability of deep cnn-based machine learning algorithms. however, performance and energy consumption limitations make the execution of such computationally intensive algorithms on mobile devices prohibitive. we present a gpu-accelerated library, dubbed cnndroid, for execution of trained deep cnns on android-based mobile devices. empirical evaluations show that cnndroid achieves up to 60x speedup and 130x energy saving on current mobile devices. the cnndroid open source library is available for download at https://github.com/encp/cnndroid",4 "efficient first order methods for linear composite regularizers. a wide class of regularization problems in machine learning and statistics employ a regularization term which is obtained by composing a simple convex function \omega with a linear transformation. this setting includes group lasso methods, the fused lasso and other total variation methods, multi-task learning methods and many more. in this paper, we present a general approach for computing the proximity operator of this class of regularizers, under the assumption that the proximity operator of the function \omega is known in advance. our approach builds on a recent line of research on optimal first order optimization methods and uses fixed point iterations for numerically computing the proximity operator. it is more general than current approaches and, as we show with numerical simulations, computationally more efficient than available first order methods which do not achieve the optimal rate. in particular, our method outperforms state of the art o(1/t) methods for overlapping group lasso and matches optimal o(1/t^2) methods for the fused lasso and tree structured group lasso.",4 "word sense disambiguation: a complex network approach.
in recent years, concepts and methods of complex networks have been employed to tackle the word sense disambiguation (wsd) task by representing words as nodes, which are connected if they are semantically similar. despite the increasing number of studies carried out with such models, most of them use networks only to represent the data, while the pattern recognition is performed in the attribute space using traditional learning techniques. in other words, the structural relationships between words are not explicitly used in the pattern recognition process. in addition, only a few investigations have probed the suitability of representations based on bipartite networks and graphs (bigraphs) for the problem, as many approaches consider all possible links between words. in this context, we assess the relevance of a bipartite network model representing both feature words (i.e. words characterizing the context) and target (ambiguous) words to solve ambiguities in written texts. here, we focus on the semantic relationships between the two types of words, disregarding the relationships between feature words. in special, the proposed method not only serves to represent texts as graphs, but also constructs the structure on which the discrimination of senses is accomplished. our results reveal that the proposed learning algorithm on bipartite networks provides excellent results, mostly when topical features are employed to characterize the context. surprisingly, our method even outperformed the support vector machine algorithm in particular cases, with the advantage of being robust even when only a small training dataset is available. taken together, these results show that the proposed representation/classification method might be useful to improve the semantic characterization of written texts.",4 "when is there a representer theorem? vector versus matrix regularizers. we consider a general class of regularization methods which learn a vector of parameters on the basis of linear measurements. it is well known that if the regularizer is a nondecreasing function of the inner product, then the learned vector is a linear combination of the input data. this result, known as the {\em representer theorem}, is at the basis of kernel-based methods in machine learning. in this paper, we prove the necessity of the above condition, thereby completing the characterization of kernel methods based on regularization. we further extend our analysis to regularization methods which learn a matrix, a problem which is motivated by the application to multi-task learning.
in this context, we study a more general representer theorem, which holds for a larger class of regularizers. we provide a necessary and sufficient condition for this class of matrix regularizers and highlight concrete examples of practical importance. our analysis uses basic principles from matrix theory, especially the useful notion of a matrix nondecreasing function.",4 "wrpn: wide reduced-precision networks. for computer vision applications, prior works have shown the efficacy of reducing the numeric precision of model parameters (network weights) in deep neural networks. activation maps, however, occupy a large memory footprint during both the training and inference steps when using mini-batches of inputs. one way to reduce this large memory footprint is to reduce the precision of activations. however, past works have shown that reducing the precision of activations hurts model accuracy. we study schemes to train networks from scratch using reduced-precision activations without hurting accuracy. we reduce the precision of activation maps (along with model parameters) and increase the number of filter maps in a layer, and find that this scheme matches or surpasses the accuracy of the baseline full-precision network. as a result, one can significantly improve execution efficiency (e.g. reduce dynamic memory footprint, memory bandwidth and computational energy) and speed up the training and inference process with appropriate hardware support. we call our scheme wrpn - wide reduced-precision networks. we report results that show that our wrpn scheme achieves better accuracies than previously reported on the ilsvrc-12 dataset while being computationally less expensive than previously reported reduced-precision networks.",4 "convergence guarantees for kernel-based quadrature rules in misspecified settings. kernel-based quadrature rules are becoming important in machine learning and statistics, as they achieve super-$\sqrt{n}$ convergence rates in numerical integration, and thus provide alternatives to monte carlo integration in challenging settings where integrands are expensive to evaluate or where integrands are high dimensional. these rules are based on the assumption that the integrand has a certain degree of smoothness, expressed as the integrand belonging to a certain reproducing kernel hilbert space (rkhs).
however, this assumption can be violated in practice (e.g., when the integrand is a black box function), and no general theory has been established for the convergence of kernel quadratures in such misspecified settings. our contribution is proving that kernel quadratures can be consistent even when the integrand does not belong to the assumed rkhs, i.e., when the integrand is less smooth than assumed. specifically, we derive convergence rates that depend on the (unknown) lesser smoothness of the integrand, where the degree of smoothness is expressed via powers of rkhss or via sobolev spaces.",19 "deep reinforcement learning with macro-actions. deep reinforcement learning has been shown to be a powerful framework for learning policies from complex high-dimensional sensory inputs to actions in complex tasks, such as the atari domain. in this paper, we explore output representation modeling in the form of temporal abstraction to improve the convergence and reliability of deep reinforcement learning approaches. we concentrate on macro-actions, evaluate them on different atari 2600 games, and show that they yield significant improvements in learning speed. additionally, we show that they can even achieve better scores than dqn. we offer analysis and explanation for both the convergence and the final results, revealing a problem deep rl approaches have with sparse reward signals.",4 "end-to-end training for whole image breast cancer diagnosis using a convolutional design. we develop an end-to-end training algorithm for whole-image breast cancer diagnosis based on mammograms. it requires lesion annotations only in the first stage of training. after that, a whole image classifier can be trained using only image level labels. this greatly reduces the reliance on lesion annotations. our approach is implemented using a convolutional design that is simple yet provides superior performance in comparison with previous methods. on ddsm, our best single-model achieves a per-image auc score of 0.88, and three-model averaging increases the score to 0.91. on inbreast, our best single-model achieves a per-image auc score of 0.96. under the ddsm benchmark, our models compare favorably with the current state-of-the-art. we also demonstrate that a whole image model trained on ddsm can be easily transferred to inbreast without using lesion annotations and using only a small amount of training data. code availability: https://github.com/lishen/end2end-all-conv",4 "bayesian group factor analysis.
we introduce a factor analysis model that summarizes the dependencies between observed variable groups, instead of the dependencies between individual variables as standard factor analysis does. a group may correspond to one view of the same set of objects, one of many data sets tied by co-occurrence, or a set of alternative variables collected from statistics tables to measure one property of interest. we show that by assuming group-wise sparse factors, active in only a subset of the sets, the variation can be decomposed into factors explaining relationships between the sets and factors explaining away set-specific variation. we formulate the assumptions in a bayesian model which provides the factors, and apply the model to two data analysis tasks, in neuroimaging and chemical systems biology.",19 "propositional satisfiability and answer-set programming. we show that propositional logic and its extensions can support answer-set programming in the same way stable logic programming and disjunctive logic programming do. to this end, we introduce a logic based on the logic of propositional schemata and a version of the closed world assumption. we call it the extended logic of propositional schemata with cwa (ps+, in symbols). an important feature of this logic is that it supports explicit modeling of constraints on cardinalities of sets. in the paper, we characterize the class of problems that can be solved by finite ps+ theories. we implement a programming system based on the logic ps+ and design and implement a solver for processing theories in ps+. we present encouraging performance results for our approach --- we show that it is competitive with smodels, a state-of-the-art answer-set programming system based on stable logic programming.",4 "connecting language and knowledge bases with embedding models for relation extraction. this paper proposes a novel approach for relation extraction from free text which is trained to jointly use information from the text and from existing knowledge. our model is based on two scoring functions that operate by learning low-dimensional embeddings of words and of entities and relationships from a knowledge base. we empirically show, on new york times articles aligned with freebase relations, that our approach is able to efficiently use the extra information provided by a large subset of freebase data (4m entities, 23k relationships) to improve over existing methods that rely on text features alone.",4 "jointly learning sentence embeddings and syntax with unsupervised tree-lstms.
we introduce a neural network that represents sentences by composing their words according to induced binary parse trees. we use a tree-lstm as our composition function, applied along a tree structure found by a fully differentiable natural language chart parser. our model simultaneously optimises both the composition function and the parser, thus eliminating the need for externally-provided parse trees which are normally required for tree-lstms. it can therefore be seen as a tree-based rnn that is unsupervised with respect to the parse trees. as it is fully differentiable, our model is easily trained with an off-the-shelf gradient descent method and backpropagation. we demonstrate that it achieves better performance compared to various supervised tree-lstm architectures on a textual entailment task and a reverse dictionary task.",4 "multilabel classification with the r package mlr. we implemented several multilabel classification algorithms in the machine learning package mlr. the implemented methods are binary relevance, classifier chains, nested stacking, dependent binary relevance and stacking, which can be used with any base learner that is accessible in mlr. moreover, there is access to the multilabel classification versions of randomforestsrc and rferns. all these methods can be easily compared by different implemented multilabel performance measures and resampling methods in the standardized mlr framework. in a benchmark experiment with several multilabel datasets, the performance of the different methods is evaluated.",19 "opennmt: an open-source toolkit for neural machine translation. we describe an open-source toolkit for neural machine translation (nmt). the toolkit prioritizes efficiency, modularity, and extensibility with the goal of supporting nmt research into model architectures, feature representations, and source modalities, while maintaining competitive performance and reasonable training requirements. the toolkit consists of modeling and translation support, as well as detailed pedagogical documentation about the underlying techniques.",4 "foreground segmentation using a triplet convolutional neural network for multiscale feature encoding. a common approach for moving objects segmentation in a scene is to perform background subtraction. several methods have been proposed in this domain.
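The background subtraction baseline that the foreground-segmentation abstract above starts from can be sketched in its classic form: keep a per-pixel running-average background model and flag pixels that deviate from it. This is the standard baseline, not the paper's triplet-CNN method, and the `alpha`/`thresh` values are illustrative:

```python
import numpy as np

def background_subtract(frames, alpha=0.05, thresh=30.0):
    """Running-average background subtraction: maintain a per-pixel
    background estimate and flag pixels far from it as foreground.

    frames: sequence of equally shaped grayscale images.
    Returns one boolean foreground mask per frame after the first.
    """
    bg = frames[0].astype(np.float64)
    masks = []
    for frame in frames[1:]:
        frame = frame.astype(np.float64)
        fg = np.abs(frame - bg) > thresh          # foreground mask
        bg = (1 - alpha) * bg + alpha * frame     # slowly adapt background
        masks.append(fg)
    return masks
```

The failure modes listed in the abstract (illumination change, camera motion, camouflage, shadows) all violate this model's static-background assumption, which is what motivates the learned encoder-decoder approach.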
however, they lack the ability to handle various difficult scenarios such as illumination changes, background or camera motion, camouflage effect, shadows, etc. to address these issues, we propose a robust and flexible encoder-decoder type neural network based approach. we adapt a pre-trained convolutional network, i.e. vgg-16 net, under a triplet framework in the encoder part to embed an image at multiple scales into the feature space, and use a transposed convolutional network in the decoder part to learn a mapping from feature space to image space. we train this network end-to-end using only a few training samples. our network takes an rgb image at three different scales and produces a foreground segmentation probability mask for the corresponding image. in order to evaluate our model, we entered the change detection 2014 challenge (changedetection.net) and our method outperformed all the existing state-of-the-art methods with an average f-measure of 0.9770. our source code has been made publicly available at https://github.com/lim-anggun/fgsegnet.",4 "flexible statistical inference for mechanistic models of neural dynamics. mechanistic models of single-neuron dynamics have been extensively studied in computational neuroscience. however, identifying which models can quantitatively reproduce empirically measured data has been challenging. we propose to overcome this limitation by using likelihood-free inference approaches (also known as approximate bayesian computation, abc) to perform full bayesian inference on single-neuron models. our approach builds on recent advances in abc by learning a neural network which maps features of the observed data to the posterior distribution over parameters. we learn a bayesian mixture-density network approximating the posterior over multiple rounds of adaptively chosen simulations. furthermore, we propose an efficient approach for handling missing features and parameter settings for which the simulator fails, as well as a strategy for automatically learning relevant features using recurrent neural networks. on synthetic data, our approach efficiently estimates posterior distributions and recovers ground-truth parameters. on in-vitro recordings of membrane voltages, we recover multivariate posteriors over biophysical parameters, which yield model-predicted voltage traces that accurately match empirical data.
our approach will enable neuroscientists to perform bayesian inference on complex neuron models without having to design model-specific algorithms, closing the gap between mechanistic and statistical approaches to single-neuron modelling.",19 "cross-domain semantic parsing via paraphrasing. existing studies on semantic parsing mainly focus on the in-domain setting. we formulate cross-domain semantic parsing as a domain adaptation problem: train a semantic parser on some source domains and then adapt it to the target domain. due to the diversity of logical forms in different domains, this problem presents unique and intriguing challenges. by converting logical forms into canonical utterances in natural language, we reduce semantic parsing to paraphrasing, and develop an attentive sequence-to-sequence paraphrase model that is general and flexible to adapt to different domains. we discover two problems, small micro variance and large macro variance, of pre-trained word embeddings that hinder their direct use in neural networks, and propose standardization techniques as a remedy. on the popular overnight dataset, which contains eight domains, we show that both cross-domain training and standardized pre-trained word embeddings can bring significant improvement.",4 "grammar variational autoencoder. deep generative models have been wildly successful at learning coherent latent representations for continuous data such as video and audio. however, generative modeling of discrete data such as arithmetic expressions and molecular structures still poses significant challenges. crucially, state-of-the-art methods often produce outputs that are not valid. we make the key observation that frequently, discrete data can be represented as a parse tree from a context-free grammar. we propose a variational autoencoder which encodes and decodes directly to and from these parse trees, ensuring the generated outputs are always valid. surprisingly, we show that not only does our model more often generate valid outputs, it also learns a more coherent latent space in which nearby points decode to similar discrete outputs. we demonstrate the effectiveness of our learned models by showing their improved performance in bayesian optimization for symbolic regression and molecular synthesis.",19 "deep cnns along the time axis with intermap pooling for robustness to spectral variations.
convolutional neural networks (cnns) with convolutional and pooling operations along the frequency axis have been proposed to attain invariance to frequency shifts of features. however, this is inappropriate given the fact that acoustic features vary in frequency. in this paper, we contend that convolution along the time axis is more effective. we also propose the addition of an intermap pooling (imp) layer to deep cnns. in this layer, filters in each group extract common but spectrally variant features, and the layer then pools the feature maps of each group. as a result, the proposed imp cnn can achieve insensitivity to spectral variations characteristic of different speakers and utterances. the effectiveness of the imp cnn architecture is demonstrated on several lvcsr tasks. even without speaker adaptation techniques, the architecture achieved a wer of 12.7% on the swb part of the hub5'2000 evaluation test set, which is competitive with state-of-the-art methods.",4 "large scale, large margin classification using indefinite similarity measures. despite the success of the popular kernelized support vector machines, they have two major limitations: they are restricted to positive semi-definite (psd) kernels, and their training complexity scales at least quadratically with the size of the data. many natural measures of similarity between pairs of samples are not psd, e.g. invariant kernels, and those implicitly or explicitly defined by latent variable models. in this paper, we investigate scalable approaches for using indefinite similarity measures in large margin frameworks. in particular we show that a normalization of the similarity to a subset of the data points constitutes a representation suitable for linear classifiers. the result is a classifier which is competitive with kernelized svm in terms of accuracy, despite having better training and test time complexities. experimental results demonstrate that, on the cifar-10 dataset, our model equipped with similarity measures invariant to rigid and non-rigid deformations can be made more than 5 times sparser and more accurate than kernelized svm using rbf kernels.",4 "spectralleader: online spectral learning for single topic models. we study the problem of learning a latent variable model from a stream of data. latent variable models are popular in practice because they can explain observed data in terms of unobserved concepts. these models have traditionally been studied in the offline setting.
online em is arguably the most popular algorithm for learning latent variable models online. although it is computationally efficient, it typically converges to a local optimum. in this work, we develop a new online learning algorithm for latent variable models, which we call spectralleader. spectralleader always converges to the global optimum, and we derive an $o(\sqrt{n})$ upper bound up to log factors on its $n$-step regret in the bag-of-words model. we show that spectralleader performs similarly to or better than online em with tuned hyper-parameters in both synthetic and real-world experiments.",4 "boosting as a product of experts. in this paper, we derive a novel probabilistic model of boosting as a product of experts. we re-derive the boosting algorithm as a greedy incremental model selection procedure which ensures that the addition of new experts to the ensemble does not decrease the likelihood of the data. these learning rules lead to a generic boosting algorithm - poeboost - which turns out to be similar to the adaboost algorithm under certain assumptions on the expert probabilities. the paper extends the poeboost algorithm to poeboost.cs, which handles hypotheses that produce probabilistic predictions. this new algorithm is shown to have better generalization performance compared to other state of the art algorithms.",4 "deep encoding of etymological information in tei. this paper aims to provide a comprehensive modeling and representation of etymological data in digital dictionaries. the purpose is to integrate in one coherent framework both digital representations of legacy dictionaries and born-digital lexical databases that are constructed manually or semi-automatically. we want to propose a systematic and coherent set of modeling principles for a variety of etymological phenomena that may contribute to the creation of a continuum between existing and future lexical constructs, where anyone interested in tracing the history of words and their meanings will be able to seamlessly query lexical resources. instead of designing an ad hoc model and representation language for digital etymological data, we focus on identifying the possibilities offered by the tei guidelines for the representation of lexical information.",4 "inductive policy selection for first-order mdps. we select policies for large markov decision processes (mdps) with compact first-order representations. we find policies that generalize well as the number of objects in the domain grows, potentially without bound.
existing dynamic-programming approaches based on flat, propositional, or first-order representations either are impractical here or do not naturally scale as the number of objects grows without bound. we implement and evaluate an alternative approach which induces first-order policies using training data constructed by solving small problem instances using pgraphplan (blum & langford, 1999). our policies are represented as ensembles of decision lists, using a taxonomic concept language. this approach extends the work of martin and geffner (2000) to stochastic domains, ensemble learning, and a wider variety of problems. empirically, we find ""good"" policies for several stochastic first-order mdps that are beyond the scope of previous approaches. we also discuss the application of this work to the relational reinforcement-learning problem.",4 "multi-view learning over structured and non-identical outputs. in many machine learning problems, labeled training data is limited but unlabeled data is ample. some of these problems have instances that can be factored into multiple views, each of which is nearly sufficient in determining the correct labels. in this paper we present a new algorithm for probabilistic multi-view learning which uses the idea of stochastic agreement between views as regularization. our algorithm works on structured and unstructured problems and easily generalizes to partial agreement scenarios. for the full agreement case, our algorithm minimizes the bhattacharyya distance between the models of each view, and performs better than coboosting and two-view perceptron on several flat and structured classification problems.",4 "federated multi-task learning. federated learning poses new statistical and systems challenges in training machine learning models over distributed networks of devices. in this work, we show that multi-task learning is naturally suited to handle the statistical challenges of this setting, and propose a novel systems-aware optimization method, mocha, that is robust to practical systems issues. our method and theory for the first time consider issues of high communication cost, stragglers, and fault tolerance for distributed multi-task learning. the resulting method achieves significant speedups compared to alternatives in the federated setting, as we demonstrate through simulations on real-world federated datasets.",4 "traversing environments using possibility graphs for humanoid robots.
locomotion for legged robots poses considerable challenges when confronted by obstacles and adverse environments. footstep planners are typically designed for one mode of locomotion, but traversing unfavorable environments may require several forms of locomotion to be sequenced together, such as walking, crawling, and jumping. multi-modal motion planners can be used to address some of these problems, but existing implementations tend to be time-consuming and are limited to quasi-static actions. this paper presents a motion planning method to traverse complex environments using multiple categories of actions. we introduce the concept of the ""possibility graph"", which uses high-level approximations of constraint manifolds to rapidly explore the ""possibility"" of actions, thereby allowing lower-level single-action motion planners to be utilized efficiently. we show that the possibility graph can quickly find paths through several different challenging environments which require various combinations of actions in order to traverse.",4 "hierarchical learning for dnn-based acoustic scene classification. in this paper, we present a deep neural network (dnn)-based acoustic scene classification framework. two hierarchical learning methods are proposed to improve the dnn baseline performance by incorporating the hierarchical taxonomy information of environmental sounds. firstly, the parameters of the dnn are initialized by the proposed hierarchical pre-training. then a multi-level objective function is adopted to add a constraint to the cross-entropy based loss function. a series of experiments were conducted on task1 of the detection and classification of acoustic scenes and events (dcase) 2016 challenge. the final dnn-based system achieved a 22.9% relative improvement on average scene classification error as compared with the gaussian mixture model (gmm)-based benchmark system across the four standard folds.",4 "mastering the dungeon: grounded language learning by mechanical turker descent. contrary to most natural language processing research, which makes use of static datasets, humans learn language interactively, grounded in an environment. in this work we propose an interactive learning procedure called mechanical turker descent (mtd) and use it to train agents to execute natural language commands grounded in a fantasy text adventure game.
in mtd, turkers compete to train better agents in the short term, and collaborate by sharing their agents' skills in the long term. this results in a gamified, engaging experience for the turkers and a better quality teaching signal for the agents compared to static datasets, as the turkers naturally adapt the training data to the agent's abilities.",4 "inductive representation learning on large graphs. low-dimensional embeddings of nodes in large graphs have proved extremely useful in a variety of prediction tasks, from content recommendation to identifying protein functions. however, most existing approaches require that all nodes in the graph are present during training of the embeddings; these previous approaches are inherently transductive and do not naturally generalize to unseen nodes. here we present graphsage, a general, inductive framework that leverages node feature information (e.g., text attributes) to efficiently generate node embeddings for previously unseen data. instead of training individual embeddings for each node, we learn a function that generates embeddings by sampling and aggregating features from a node's local neighborhood. our algorithm outperforms strong baselines on three inductive node-classification benchmarks: we classify the category of unseen nodes in evolving information graphs based on citation and reddit post data, and we show that our algorithm generalizes to completely unseen graphs using a multi-graph dataset of protein-protein interactions.",4 "fast low-rank matrix estimation without the condition number. in this paper, we study the general problem of optimizing a convex function $f(l)$ over the set of $p \times p$ matrices, subject to rank constraints on $l$. however, existing first-order methods for solving such problems either are too slow to converge, or require multiple invocations of singular value decompositions. on the other hand, factorization-based non-convex algorithms, while being much faster, require stringent assumptions on the \emph{condition number} of the optimum. in this paper, we provide a novel algorithmic framework that achieves the best of both worlds: asymptotically as fast as factorization methods, while requiring no dependency on the condition number.
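The sample-and-aggregate idea from the graphsage abstract above can be sketched as a single mean-aggregator layer. This is a simplified illustration, not the paper's implementation: the weight matrix is an untrained stand-in, and only one layer of neighborhood aggregation is shown:

```python
import numpy as np

def sage_embed(features, adj, weight, rng, num_samples=5):
    """One GraphSAGE-style mean-aggregator layer (sketch): for each node,
    sample up to num_samples neighbors, average their features,
    concatenate with the node's own features, and apply a shared linear
    map + ReLU followed by l2 normalization.  Because the weight is
    shared across nodes, the layer applies to nodes unseen in training.

    features: (n, d) node feature matrix; adj: list of neighbor lists.
    """
    out = []
    for v, neigh in enumerate(adj):
        if neigh:
            picks = rng.choice(neigh, size=min(num_samples, len(neigh)),
                               replace=False)
            agg = features[picks].mean(axis=0)   # mean of sampled neighbors
        else:
            agg = np.zeros(features.shape[1])    # isolated node: zero vector
        h = np.maximum(np.concatenate([features[v], agg]) @ weight, 0.0)
        norm = np.linalg.norm(h)
        out.append(h / norm if norm > 0 else h)
    return np.stack(out)
```

Stacking several such layers (each with its own weight) extends the receptive field to multi-hop neighborhoods, which is the inductive mechanism the abstract describes.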
we instantiate our general framework for three important matrix estimation problems that impact several practical applications: (i) a \emph{nonlinear} variant of affine rank minimization, (ii) logistic pca, and (iii) precision matrix estimation in probabilistic graphical model learning. we then derive explicit bounds on the sample complexity as well as the running time of our approach, and show that it achieves the best possible bounds in each of these cases. we also provide an extensive range of experimental results, and demonstrate that our algorithm provides an attractive tradeoff between estimation accuracy and running time.",19 "rdf annotation of second life objects: knowledge representation meets social virtual reality. we designed and implemented an application running inside second life that supports user annotation of graphical objects and graphical visualization of concept ontologies, thus providing a formal, machine-accessible description of objects. as a result, we offer a platform that combines the graphical knowledge representation expected in a muve artifact with the semantic structure given by a resource description framework (rdf) representation of information.",4 "accelerating partial-order planners: techniques for effective search control and pruning. we propose some domain-independent techniques for bringing well-founded partial-order planners closer to practicality. the first two techniques are aimed at improving search control while keeping overhead costs low. one is based on a simple adjustment to the default a* heuristic used by ucpop to select plans for refinement. the other is based on preferring ``zero commitment'' (forced) plan refinements whenever possible, and using lifo prioritization otherwise. a more radical technique is the use of operator parameter domains to prune the search. these domains are initially computed from the definitions of the operators and the initial and goal conditions, using a polynomial-time algorithm that propagates sets of constants through the operator graph, starting from the initial conditions. during planning, parameter domains can be used to prune nonviable operator instances and to remove spurious clobbering threats. in experiments based on modifications of ucpop, the improved plan and goal selection strategies gave speedups by factors ranging from 5 to more than 1000 on a variety of problems that are nontrivial for the unmodified version.
crucially, hardest problems gave greatest improvements. pruning technique based parameter domains often gave speedups order magnitude difficult problems, default ucpop search strategy improved strategy. lisp code techniques test problems provided on-line appendices.",4 "fast online clustering randomized skeleton sets. present new fast online clustering algorithm reliably recovers arbitrary-shaped data clusters high-throughput data streams. unlike existing state-of-the-art online clustering methods based k-means k-medoid, make no restrictive generative assumptions. addition, contrast existing nonparametric clustering techniques dbscan denstream, gives provable theoretical guarantees. achieve fast clustering, propose represent cluster skeleton set updated continuously new data seen. skeleton set consists weighted samples data weights encode local densities. size skeleton set adapted according cluster geometry. proposed technique automatically detects number clusters robust outliers. algorithm works infinite data stream one pass data feasible. provide theoretical guarantees quality clustering also demonstrate advantage existing state-of-the-art several datasets.",4 "least generalizations greatest specializations sets clauses. main operations inductive logic programming (ilp) generalization specialization, make sense generality order. ilp, three important generality orders subsumption, implication implication relative background knowledge. two languages used often languages clauses languages horn clauses. gives total six different ordered languages. paper, give systematic treatment existence non-existence least generalizations greatest specializations finite sets clauses six ordered sets. survey results already obtained others also contribute answers own. main new results are, firstly, existence computable least generalization implication every finite set clauses containing least one non-tautologous function-free clause (among other, not necessarily function-free clauses). 
secondly, show least generalization need not exist relative implication, even set generalized background knowledge function-free. thirdly, give complete discussion existence non-existence greatest specializations six ordered languages.",4 "predicting industry users social media. automatic profiling social media users important task supporting multitude downstream applications. number studies used social media content extract study collective social attributes, lack substantial research addresses detection user's industry. frame task classification using feature engineering ensemble learning. industry-detection system uses posted content profile information detect user's industry 64.3% accuracy, significantly outperforming majority baseline taxonomy fourteen industry classes. qualitative analysis suggests person's industry affects words used perceived meanings, also number type emotions expressed.",4 "variational inference policy gradient. inspired seminal work stein variational inference stein variational policy gradient, derived method generate samples posterior variational parameter distribution \textit{explicitly} minimizing kl divergence match target distribution amortized fashion. consequently, applied variational inference technique vanilla policy gradient, trpo ppo bayesian neural network parameterizations reinforcement learning problems.",4 "tumor motion tracking liver ultrasound images using mean shift active contour. paper present new method motion tracking tumors liver ultrasound image sequences. algorithm two main steps. first step, apply mean shift algorithm multiple features estimate center target frame. target first frame defined using ellipse. edge, texture, intensity features extracted first frame, mean shift algorithm applied feature separately find center ellipse related feature next frame. center ellipse weighted average centers. using mean shift actually estimate target movement two consecutive frames. 
correct ellipsoid frame known, second step apply dynamic directional gradient vector flow (ddgvf) version active contour models, order find correct boundary tumors. sample points boundary active contour translate points based translation center ellipsoid two consecutive frames determine target movement. use translated sample points initial guess active contour next frame. experimental results show that, suggested method provides reliable performance liver tumor tracking ultrasound image sequences.",4 "supervised feature evaluation consistency analysis: application measure sets used characterise geographic objects. nowadays, supervised learning commonly used many domains. indeed, many works propose learn new knowledge examples translate expected behaviour considered system. key issue supervised learning concerns description language used represent examples. paper, propose method evaluate feature set used describe them. method based computation consistency example base. carried case study domain geomatic order evaluate sets measures used characterise geographic objects. case study shows method allows give relevant evaluations measure sets.",4 "integration spatio-temporal contrast sensitivity multi-slice channelized hotelling observer. barten's model spatio-temporal contrast sensitivity function human visual system embedded multi-slice channelized hotelling observer. done 3d filtering stack images spatio-temporal contrast sensitivity function feeding result (i.e., perceived image stack) multi-slice channelized hotelling observer. proposed procedure considering spatio-temporal contrast sensitivity function generic sense used observers multi-slice channelized hotelling observer. detection performance new observer digital breast tomosynthesis measured variety browsing speeds, two spatial sampling rates, using computer simulations. results show peak detection performance mid browsing speeds. compare results human observer study reported earlier (i. diaz et al. 
spie mi 2011). effects display luminance, contrast spatial sampling rate, without considering foveal vision, also studied. reported simulations conducted real digital breast tomosynthesis image stacks, well stacks anthropomorphic software breast phantom (p. bakic et al. med phys. 2011). lesion cases simulated inserting single micro-calcifications masses. limitations methods ways improve discussed.",4 "metrics matter! incompatibility different flavors replanning. autonomous agents executing real world, state world well objectives agent may change agent's original model. cases, agent's planning process must modify plan execution make amenable new conditions, resume execution. brings replanning problem, various techniques proposed solve it. all, three main techniques -- based three different metrics -- proposed prior automated planning work. open question whether metrics interchangeable; answering requires normalized comparison various replanning quality metrics. paper, show possible support comparison compiling respective techniques single substrate. using novel compilation, demonstrate different metrics not interchangeable, nor good surrogates other. thus focus attention incompatibility various replanning flavors other, founded differences metrics respectively seek optimize.",4 "interpretable explanations black boxes meaningful perturbation. machine learning algorithms increasingly applied high impact yet high risk tasks, medical diagnosis autonomous driving, critical researchers explain algorithms arrived predictions. recent years, number image saliency methods developed summarize highly complex neural networks ""look"" image evidence predictions. however, techniques limited heuristic nature architectural constraints. paper, make two main contributions: first, propose general framework learning different kinds explanations black box algorithm. second, specialise framework find part image responsible classifier decision. 
unlike previous works, method model-agnostic testable grounded explicit interpretable image perturbations.",4 "load balanced gans multi-view face image synthesis. multi-view face synthesis single image ill-posed problem often suffers serious appearance distortion. producing photo-realistic identity preserving multi-view results still not well defined synthesis problem. paper proposes load balanced generative adversarial networks (lb-gan) precisely rotate yaw angle input face image specified angle. lb-gan decomposes challenging synthesis problem two well constrained subtasks correspond face normalizer face editor respectively. normalizer first frontalizes input image, editor rotates frontalized image desired pose guided remote code. order generate photo-realistic local details, normalizer editor trained two-stage manner regulated conditional self-cycle loss attention based l2 loss. exhaustive experiments controlled uncontrolled environments demonstrate proposed method improves visual realism multi-view synthetic images, also preserves identity information well.",4 "unbeatable imitation. show many classes symmetric two-player games, simple decision rule ""imitate-the-best"" hardly beaten decision rule. provide necessary sufficient conditions imitation unbeatable show beaten much games rock-scissors-paper variety. thus, many interesting examples, like 2x2 games, cournot duopoly, price competition, rent seeking, public goods games, common pool resource games, minimum effort coordination games, arms race, search, bargaining, etc., imitation cannot beaten much even clever opponent.",4 "self-contained easily accessible discussion method descente infinie fermat's explicitly known proof descente infinie. present proof pierre fermat descente infinie known exist today. text latin original requires active mathematical interpretation, proof sketch proper mathematical proof. discuss descente infinie mathematical, logical, historical, linguistic, refined logic-historical points view. 
provide required preliminaries number theory develop self-contained proof modern form, nevertheless intended follow fermat's ideas closely. annotate english translation fermat's original proof terms modern proof. including important facts, present concise self-contained discussion fermat's proof sketch, easily accessible laymen number theory well laymen history mathematics, provides new clarification method descente infinie experts fields. last least, paper fills gap regarding easy accessibility subject.",4 "look wider match image patches convolutional neural networks. human matches two images, viewer natural tendency view wide area around target pixel obtain clues right correspondence. however, designing matching cost function works large window way difficult. cost function typically not intelligent enough discard information irrelevant target pixel, resulting undesirable artifacts. paper, propose novel method learn stereo matching cost large-sized window. unlike conventional pooling layers strides, proposed per-pixel pyramid-pooling layer cover large area without loss resolution detail. therefore, learned matching cost function successfully utilize information large area without introducing fattening effect. proposed method robust despite presence weak textures, depth discontinuity, illumination, exposure difference. proposed method achieves near-peak performance middlebury benchmark.",4 "lexicon integrated cnn models attention sentiment analysis. advent word embeddings, lexicons no longer fully utilized sentiment analysis although still provide important features traditional setting. paper introduces novel approach sentiment analysis integrates lexicon embeddings attention mechanism convolutional neural networks. approach performs separate convolutions word lexicon embeddings provides global view document using attention. models experimented semeval'16 task 4 dataset stanford sentiment treebank, show comparative better results existing state-of-the-art systems. 
analysis shows lexicon embeddings allow build high-performing models much smaller word embeddings, attention mechanism effectively dims noisy words sentiment analysis.",4 "neural network architecture optimization submodularity supermodularity. deep learning models' architectures, including depth width, key factors influencing models' performance, test accuracy computation time. paper solves two problems: given computation time budget, choose architecture maximize accuracy, given accuracy requirement, choose architecture minimize computation time. convert architecture optimization subset selection problem. accuracy's submodularity computation time's supermodularity, propose efficient greedy optimization algorithms. experiments demonstrate algorithm's ability find accurate models faster models. analyzing architecture evolution growing time budget, discuss relationships among accuracy, time architecture, give suggestions neural network architecture design.",19 "multi-view regularized gaussian processes. gaussian processes (gps) proven powerful tools various areas machine learning. however, applications gps scenario multi-view learning. paper, present new gp model multi-view learning. unlike existing methods, combines multiple views regularizing marginal likelihood consistency among posterior distributions latent functions different views. moreover, give general point selection scheme multi-view learning improve proposed model criterion. experimental results multiple real world data sets verified effectiveness proposed model witnessed performance improvement employing novel point selection scheme.",19 "shadow estimation method ""the episolar constraint: monocular shape shadow correspondence"". recovering shadows important step many vision algorithms. current approaches work time-lapse sequences limited simple thresholding heuristics. show approaches work careful tuning parameters, not work well long-term time-lapse sequences taken span many months. 
introduce parameter-free expectation maximization approach simultaneously estimates shadows, albedo, surface normals, skylight. approach accurate previous methods, works short long sequences, robust effects nonlinear camera response. finally, demonstrate shadow masks derived algorithm substantially improve performance sun-based photometric stereo compared earlier shadow mask estimation.",4 "zero-shot learning via category-specific visual-semantic mapping. zero-shot learning (zsl) aims classify test instance unseen category based training instances seen categories, gap seen categories unseen categories generally bridged via visual-semantic mapping low-level visual feature space intermediate semantic space. however, visual-semantic mapping learnt based seen categories may not generalize well unseen categories data distributions seen categories unseen categories considerably different, known projection domain shift problem zsl. address domain shift issue, propose method named adaptive embedding zsl (aezsl) learn adaptive visual-semantic mapping unseen category based similarities unseen category seen categories. then, make two extensions based aezsl method. firstly, order utilize unlabeled test instances unseen categories, extend aezsl semi-supervised approach named aezsl label refinement (aezsl_lr), progressive approach developed update visual classifiers refine predicted test labels alternatively based similarities among test instances among unseen categories. secondly, avoid learning visual-semantic mapping unseen category large-scale classification task, extend aezsl deep adaptive embedding model named deep aezsl (daezsl) sharing similar idea (i.e., visual-semantic mapping category-specific related semantic space) aezsl, needs trained once, applied arbitrary number unseen categories. extensive experiments demonstrate proposed methods achieve state-of-the-art results image classification four benchmark datasets.",4 "quality resilient deep neural networks. 
study deep neural networks classification images quality distortions. first show networks fine-tuned distorted data greatly outperform original networks tested distorted data. however, fine-tuned networks perform poorly quality distortions not trained for. propose mixture experts ensemble method robust different types distortions. ""experts"" model trained particular type distortion. output model weighted sum expert models, weights determined separate gating network. gating network trained predict optimal weights particular distortion type level. testing, network blind distortion level type, yet still assign appropriate weights expert models. additionally investigate weight sharing methods mixture model show improved performance achieved large reduction number unique network parameters.",4 "robustfill: neural program learning noisy i/o. problem automatically generating computer program specification studied since early days ai. recently, two competing approaches automatic program learning received significant attention: (1) neural program synthesis, neural network conditioned input/output (i/o) examples learns generate program, (2) neural program induction, neural network generates new outputs directly using latent program representation. here, first time, directly compare approaches large-scale, real-world learning task. additionally contrast rule-based program synthesis, uses hand-crafted semantics guide program generation. neural models use modified attention rnn allow encoding variable-sized sets i/o pairs. best synthesis model achieves 92% accuracy real-world test set, compared 34% accuracy previous best neural synthesis approach. synthesis model also outperforms comparable induction model task, importantly demonstrate strength approach highly dependent evaluation metric end-user application. 
finally, show train neural models remain robust type noise expected real-world data (e.g., typos), highly-engineered rule-based system fails entirely.",4 "multiset ordering constraints. identify new important global (or non-binary) constraint. constraint ensures values taken two vectors variables, viewed multisets, ordered. constraint useful number different applications including breaking symmetry fuzzy constraint satisfaction. propose implement efficient linear time algorithm enforcing generalised arc consistency multiset ordering constraint. experimental results several problem domains show considerable promise.",4 tensor2tensor neural machine translation. tensor2tensor library deep learning models well-suited neural machine translation includes reference implementation state-of-the-art transformer model.,4 "closed-form solution rotation matrix arising computer vision problems. show closed-form solution maximization trace(a'r), given r unknown rotation matrix. problem occurs many computer vision tasks involving optimal rotation matrix estimation. solution continuously reinvented different fields part specific problems. summarize historical evolution problem present general proof solution. contribute proof considering degenerate cases discuss uniqueness r.",4 "decision support systems (dss) construction tendering processes. successful execution construction project heavily impacted making right decision tendering processes. managing tender procedures complex uncertain involving coordination many tasks individuals different priorities objectives. bias inconsistent decision inevitable decision-making process totally depends intuition, subjective judgement emotion. making transparent decision healthy competition tendering, exists need flexible guidance tool decision support. aim paper give review current practices decision support systems (dss) technology construction tendering processes. 
current practices general tendering processes applied countries different regions united states, europe, middle east asia comprehensively discussed. applications web-based tendering processes also summarised terms properties. besides that, summary decision support system (dss) components included next section. furthermore, prior researches implementation dss approaches tendering processes discussed details. current issues arise paper-based web-based tendering processes outlined. finally, conclusion included end paper.",4 "kernel test goodness fit. propose nonparametric statistical test goodness-of-fit: given set samples, test determines likely generated target density function. measure goodness-of-fit divergence constructed via stein's method using functions reproducing kernel hilbert space. test statistic based empirical estimate divergence, taking form v-statistic terms log gradients target density kernel. derive statistical test, i.i.d. non-i.i.d. samples, estimate null distribution quantiles using wild bootstrap procedure. apply test quantifying convergence approximate markov chain monte carlo methods, statistical model criticism, evaluating quality fit vs model complexity nonparametric density estimation.",19 "efficient trimmed convolutional arithmetic encoding lossless image compression. arithmetic encoding essential class coding techniques widely used various data compression systems exhibited promising performance. one key issue arithmetic encoding method predict probability current symbol encoded context, i.e., preceding encoded symbols, usually executed building look-up table (lut). however, complexity lut increases exponentially length context. thus, solutions limited modeling large context, inevitably restricts compression performance. several recent convolutional neural network (cnn) recurrent neural network (rnn)-based solutions developed account large context, still costly computation. 
inefficiency existing methods mainly attributed probability prediction performed independently neighboring symbols, actually efficiently conducted shared computation. end, propose trimmed convolutional network arithmetic encoding (tcae) model large context maintaining computational efficiency. trimmed convolution, convolutional kernels specially trimmed respect compression order context dependency input symbols. benefited trimmed convolution, probability prediction symbols efficiently performed one single forward pass via fully convolutional network. experiments show tcae attains better compression ratio lossless gray image compression, adopted cnn-based lossy image compression achieve state-of-the-art rate-distortion performance real-time encoding speed.",4 "solution crime scene reconstruction using time-of-flight cameras. work, propose method three-dimensional (3d) reconstruction wide crime scene, based simultaneous localization mapping (slam) approach. used kinect v2 time-of-flight (tof) rgb-d camera provide colored dense point clouds 30 hz frequency. device moved freely (6 degrees freedom) scene exploration. implemented slam solution aligns successive point clouds using 3d keypoints description matching approach. type approach exploits colorimetric geometrical information, permits reconstruction poor illumination conditions. solution tested indoor crime scene outdoor archaeological site reconstruction, returning mean error around one centimeter. less precise environmental laser scanner solution, practical portable well less cumbersome. also, hardware definitively cheaper.",4 "deep learning rf sub-sampled b-mode ultrasound imaging. portable, three dimensional, ultra-fast ultrasound (us) imaging systems, increasing need reconstruct high quality images limited number rf data receiver (rx) scan-line (sc) sub-sampling. however, due severe side lobe artifacts rf sub-sampling, standard beam-former often produces blurry images less contrast not suitable diagnostic purpose. 
address problem, researchers studied compressed sensing (cs) exploit sparsity image rf data domains. however, existing cs approaches require either hardware changes computationally expensive algorithms. overcome limitations, propose novel deep learning approach directly interpolates missing rf data utilizing redundancy rx-sc plane. particular, network design principle derives novel interpretation deep neural network cascaded convolution framelets learns data-driven bases hankel matrix decomposition. extensive experimental results sub-sampled rf data real us system confirmed proposed method effectively reduce data rate without sacrificing image quality.",4 "scenarios: new representation complex scene understanding. ability computational agents reason high-level content real world scene images important many applications. existing attempts addressing problem complex scene understanding lack representational power, efficiency, ability create robust meta-knowledge scenes. paper, introduce scenarios new way representing scenes. scenario simple, low-dimensional, data-driven representation consisting sets frequently co-occurring objects useful wide range scene understanding tasks. learn scenarios data using novel matrix factorization method integrate new neural network architecture, scenarionet. using scenarionet, recover semantic information real world scene images three levels granularity: 1) scene categories, 2) scenarios, 3) objects. training single scenarionet model enables us perform scene classification, scenario recognition, multi-object recognition, content-based scene image retrieval, content-based image comparison. addition solving many tasks single, unified framework, scenarionet computationally efficient cnns requires significantly fewer parameters achieving similar performance benchmark tasks interpretable produces explanations making decisions. 
validate utility scenarios scenarionet diverse set scene understanding tasks several benchmark datasets.",4 "camera calibration global constraints motion silhouettes. address problem epipolar geometry using motion silhouettes. methods match epipolar lines frontier points across views, used set putative correspondences. introduce approach improves two orders magnitude performance state-of-the-art methods, significantly reducing number outliers putative matching. model frontier points' correspondence problem constrained flow optimization, requiring small differences coordinates consecutive frames. approach formulated linear integer program show due nature problem, solved efficiently iterative manner. method validated four standard datasets providing accurate calibrations across different viewpoints.",4 "integrated approach crowd video analysis: tracking multi-level activity recognition. present integrated framework simultaneous tracking, group detection multi-level activity recognition crowd videos. instead solving problems independently sequentially, solve together unified framework utilize strong correlation exists among individual motion, groups, activities. explore hierarchical structure hidden video connects individuals time produce tracks, connects individuals form groups also connects groups together form crowd. show estimation hidden structure corresponds track association group detection. estimate hidden structure linear programming formulation. obtained graphical representation explored recognize node values corresponds multi-level activity recognition. problem solved structured svm framework. results publicly available dataset show competitive performance levels granularity state-of-the-art batch processing methods despite proposed technique online (causal) one.",4 "new belief markov chain model application inventory prediction. markov chain model widely applied many fields, especially field prediction. 
classical discrete-time markov chain (dtmc) widely used method prediction. however, classical dtmc model limitation system complex uncertain information state space not discrete. address it, new belief markov chain model proposed combining dempster-shafer evidence theory markov chain. model, uncertain data allowed handle form interval number basic probability assignment (bpa) generated based distance interval numbers. new belief markov chain model overcomes shortcomings classical markov chain efficient ability dealing uncertain information. moreover, example inventory prediction comparison model classical dtmc model show effectiveness rationality proposed model.",4 "parameter selection particle swarm optimization transportation network design problem. transportation planning development, transport network design problem seeks optimize specific objectives (e.g. total travel time) choosing among given set projects keeping consumption resources (e.g. budget) within limits. due numerous cases choosing projects, solving problem difficult time-consuming. based particle swarm optimization (pso) technique, heuristic solution algorithm bi-level problem designed. paper evaluates algorithm performance response changing certain basic pso parameters.",12 "analyzing classifiers: fisher vectors deep neural networks. fisher vector classifiers deep neural networks (dnns) popular successful algorithms solving image classification problems. however, generally considered `black box' predictors non-linear transformations involved far prevented transparent interpretable reasoning. recently, principled technique, layer-wise relevance propagation (lrp), developed order better comprehend inherent structured reasoning complex nonlinear classification models bag feature models dnns. 
paper (1) extend lrp framework also fisher vector classifiers use analysis tool (2) quantify importance context classification, (3) qualitatively compare dnns fv classifiers terms important image regions (4) detect potential flaws biases data. experiments performed pascal voc 2007 data set.",4 "clickbait detection tweets using self-attentive network. clickbait detection tweets remains elusive challenge. paper, describe solution zingel clickbait detector clickbait challenge 2017, capable evaluating tweet's level click baiting. first reformat regression problem multi-classification problem, based annotation scheme. perform multi-classification, apply token-level, self-attentive mechanism hidden states bi-directional gated recurrent units (bigru), enables model generate tweets' task-specific vector representations attending important tokens. self-attentive neural network trained end-to-end, without involving manual feature engineering. detector ranked first final evaluation clickbait challenge 2017.",4 "real-time hand tracking occlusion egocentric rgb-d sensor. present approach real-time, robust accurate hand pose estimation moving egocentric rgb-d cameras cluttered real environments. existing methods typically fail hand-object interactions cluttered scenes imaged egocentric viewpoints, common virtual augmented reality applications. approach uses two subsequently applied convolutional neural networks (cnns) localize hand regress 3d joint locations. hand localization achieved using cnn estimate 2d position hand center input, even presence clutter occlusions. localized hand position, together corresponding input depth value, used generate normalized cropped image fed second cnn regress relative 3d hand joint locations real time. added accuracy, robustness temporal stability, refine pose estimates using kinematic pose tracking energy. 
train cnns, introduce new photorealistic dataset uses merged reality approach capture synthesize large amounts annotated data natural hand interaction cluttered scenes. quantitative qualitative evaluation, show method robust self-occlusion occlusions objects, particularly moving egocentric perspectives.",4 "generalized loop correction method approximate inference graphical models. belief propagation (bp) one popular methods inference probabilistic graphical models. bp guaranteed return correct answer tree structures, incorrect non-convergent loopy graphical models. recently, several new approximate inference algorithms based cavity distribution proposed. methods account effect loops incorporating dependency bp messages. alternatively, region-based approximations (that lead methods generalized belief propagation) improve upon bp considering interactions within small clusters variables, thus taking small loops within clusters account. paper introduces approach, generalized loop correction (glc), benefits types loop correction. show glc relates two families inference methods, provide empirical evidence glc works effectively general, significantly accurate correction schemes.",4 "scalable multilabel prediction via randomized methods. modeling dependence outputs fundamental challenge multilabel classification. work show generic regularized nonlinearity mapping independent predictions joint predictions sufficient achieve state-of-the-art performance variety benchmark problems. crucially, compute joint predictions without ever obtaining independent predictions, incorporating low-rank smoothness regularization. achieve leveraging randomized algorithms matrix decomposition kernel approximation. furthermore, techniques applicable multiclass setting. apply method variety multiclass multilabel data sets, obtaining state-of-the-art results.",4 "video paragraph captioning using hierarchical recurrent neural networks. 
present approach exploits hierarchical recurrent neural networks (rnns) tackle video captioning problem, i.e., generating one multiple sentences describe realistic video. hierarchical framework contains sentence generator paragraph generator. sentence generator produces one simple short sentence describes specific short video interval. exploits temporal- spatial-attention mechanisms selectively focus visual elements generation. paragraph generator captures inter-sentence dependency taking input sentential embedding produced sentence generator, combining paragraph history, outputting new initial state sentence generator. evaluate approach two large-scale benchmark datasets: youtubeclips tacos-multilevel. experiments demonstrate approach significantly outperforms current state-of-the-art methods bleu@4 scores 0.499 0.305 respectively.",4 "frame interpretation validation open domain dialogue system. goal paper establish means dialogue platform able cope open domains considering possible interaction embodied agent humans. end present algorithm capable processing natural language utterances validate knowledge structures intelligent agent's mind. algorithm leverages dialogue techniques order solve ambiguities acquire knowledge unknown entities.",4 "class-splitting generative adversarial networks. generative adversarial networks (gans) produce systematically better quality samples class label information provided, i.e. conditional gan setup. still observed recently proposed wasserstein gan formulation stabilized adversarial training allows considering high capacity network architectures resnet. work show boost conditional gan augmenting available class labels. new classes come clustering representation space learned gan model. proposed strategy also feasible class information available, i.e. unsupervised setup.
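The class-splitting entry above derives its new pseudo-classes by clustering the representation space learned by the GAN; a plain k-means routine of the kind that could produce such clusters might look like this (pure-Python sketch with illustrative toy inputs, not the paper's implementation):

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means over lists of float vectors; each resulting cluster
    would become one new pseudo-class label. Illustrative sketch only."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        # assign every point to its nearest center
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            clusters[j].append(p)
        # recompute centers, keeping the old one if a cluster empties
        centers = [[sum(col) / len(cl) for col in zip(*cl)] if cl else centers[j]
                   for j, cl in enumerate(clusters)]
    return centers
```

Each original class label would then be refined into k pseudo-labels according to the cluster its samples fall into.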
generated samples reach state-of-the-art inception scores cifar-10 stl-10 datasets supervised unsupervised setup.",19 "novel clustering algorithm based modified model random walk. introduce modified model random walk, develop two novel clustering algorithms based it. algorithms, data point dataset considered particle move random space according preset rules modified model. further, data point may also viewed local control subsystem, controller adjusts transition probability vector terms feedbacks data points, transition direction identified event-generating function. finally, positions data points updated. move space, data points collect gradually separating parts emerge among automatically. consequence, data points belong class located position, whereas belong different classes away one another. moreover, experimental results demonstrated data points test datasets clustered reasonably efficiently, comparison algorithms also provides indication effectiveness proposed algorithms.",4 "effective warm start online actor-critic reinforcement learning based mhealth intervention. online reinforcement learning (rl) increasingly popular personalized mobile health (mhealth) intervention. able personalize type dose interventions according user's ongoing statuses changing needs. however, beginning online learning, usually samples support rl updating, leads poor performances. delay good performance online learning algorithms especially detrimental mhealth, users tend quickly disengage mhealth app. address problem, propose new online rl methodology focuses effective warm start. main idea make full use data accumulated decision rule achieved former study. result, greatly enrich data size beginning online learning method. case accelerates online learning process new users achieve good performances beginning online learning also whole online learning process. besides, use decision rules achieved previous study initialize parameter online rl model new users. 
provides good initialization proposed online rl algorithm. experiment results show promising improvements achieved method compared state-of-the-art method.",4 "using first-order probability logic construction bayesian networks. present mechanism constructing graphical models, specifically bayesian networks, knowledge base general probabilistic information. unique feature approach uses powerful first-order probabilistic logic expressing general knowledge base. logic allows representation wide range logical probabilistic information. model construction procedure propose uses notions direct inference identify pieces local statistical information knowledge base appropriate particular event want reason about. pieces composed generate joint probability distribution specified bayesian network. although fundamental difficulties dealing fully general knowledge, procedure practical quite rich knowledge bases supports construction far wider range networks allowed current template technology.",4 "abducing compliance incomplete event logs. capability store data business processes execution so-called event logs brought diffusion tools analysis process executions assessment goodness process model. nonetheless, tools often rigid dealing event logs include incomplete information process execution. thus, ability handling incomplete event data one challenges mentioned process mining manifesto, evaluation compliance execution trace still requires end-to-end complete trace performed. paper exploits power abduction provide flexible, yet computationally effective, framework deal different forms incompleteness event log. moreover proposes refinement classical notion compliance strong conditional compliance take account incomplete logs. 
finally, performances evaluation experimental setting shows feasibility presented approach.",4 "innovative texture database collecting approach feature extraction method based combination gray tone difference matrixes, local binary patterns, and k-means clustering. texture analysis classification problems paid much attention image processing scientists since late 80s. texture analysis done accurately, used many cases object tracking, visual pattern recognition, face recognition. since now, many methods offered solve problem. technical differences, used popular databases evaluate performance, brodatz outex, may made performance biased databases. paper, approach proposed collect efficient databases texture images. proposed approach included two stages. first one developing feature representation based gray tone difference matrixes local binary patterns features. next one consisted innovative algorithm based k-means clustering collect images based evaluated features. order evaluate performance proposed approach, texture database collected fisher rate computed collected one well known databases. also, texture classification evaluated based offered feature extraction accuracy compared state-of-the-art texture classification methods.",4 "efficient evolutionary algorithm single-objective bilevel optimization. bilevel optimization problems class challenging optimization problems, contain two levels optimization tasks. problems, optimal solutions lower level problem become possible feasible candidates upper level problem. requirement makes optimization problem difficult solve, kept researchers busy towards devising methodologies, efficiently handle problem. despite efforts, hardly exists effective methodology, capable handling complex bilevel problem. paper, introduce bilevel evolutionary algorithm based quadratic approximations (bleaq) optimal lower level variables respect upper level variables.
approach capable handling bilevel problems different kinds complexities relatively smaller number function evaluations. ideas classical optimization hybridized evolutionary methods generate efficient optimization algorithm generic bilevel problems. efficacy algorithm shown two sets test problems. first set recently proposed smd test set, contains problems controllable complexities, second set contains standard test problems collected literature. proposed method evaluated two benchmarks, performance gain observed significant.",4 "risk-constrained reinforcement learning percentile risk criteria. many sequential decision-making problems one interested minimizing expected cumulative cost taking account \emph{risk}, i.e., increased awareness events small probability high consequences. accordingly, objective paper present efficient reinforcement learning algorithms risk-constrained markov decision processes (mdps), risk represented via chance constraint constraint conditional value-at-risk (cvar) cumulative cost. collectively refer problems percentile risk-constrained mdps. specifically, first derive formula computing gradient lagrangian function percentile risk-constrained mdps. then, devise policy gradient actor-critic algorithms (1) estimate gradient, (2) update policy descent direction, (3) update lagrange multiplier ascent direction. algorithms prove convergence locally optimal policies. finally, demonstrate effectiveness algorithms optimal stopping problem online marketing application.",4 "lamarckism mechanism synthesis: approaching constrained optimization ideas biology. nonlinear constrained optimization problems encountered many scientific fields. utilize huge calculation power current computers, many mathematic models also rebuilt optimization problems. constrained conditions need handled. borrowing biological concepts, study accomplished dealing constraints synthesis four-bar mechanism. 
biologically regarding constrained condition form selection characteristics population, four new algorithms proposed, new explanation given penalty method. using algorithms, three cases tested differential-evolution based programs. better, comparable, results show presented algorithms methodology may become common means constraint handling optimization problems.",12 "online handwritten devanagari stroke recognition using extended directional features. paper describes new feature set, called extended directional features (edf) use recognition online handwritten strokes. use edf specifically recognize strokes form basis producing devanagari script, widely used indian language script. noted stroke recognition handwritten script equivalent phoneme recognition speech signals generally poor order 20% singing voice. experiments conducted automatic recognition isolated handwritten strokes. initially describe proposed feature set, namely edf show feature effectively utilized writer independent script recognition stroke recognition. experimental results show extended directional feature set performs well 65+% stroke level recognition accuracy writer independent data set.",4 "enhanced neural machine translation learning draft. neural machine translation (nmt) recently achieved impressive results. potential problem existing nmt algorithm, however, decoding conducted left right, without considering right context. paper proposes two-stage approach solve problem. first stage, conventional attention-based nmt system used produce draft translation, second stage, novel double-attention nmt system used refine translation, looking original input well draft translation. drafting-and-refinement obtain right-context information draft, hence producing consistent translations. evaluated approach using two chinese-english translation tasks, one 44k pairs 1m pairs respectively. 
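The penalty method mentioned in the mechanism-synthesis entry above can be sketched generically; the quadratic penalty, the toy objective, and the constraint below are illustrative assumptions, not the paper's biologically inspired handling:

```python
def penalized_objective(f, constraints, mu):
    """Wrap objective f with a quadratic penalty on constraint violations.

    constraints: functions g_i with feasible region g_i(x) <= 0.
    mu: penalty weight (illustrative; the paper instead treats
    constraints as selection characteristics of the population).
    """
    def wrapped(x):
        violation = sum(max(0.0, g(x)) ** 2 for g in constraints)
        return f(x) + mu * violation
    return wrapped

# toy example: minimize x^2 subject to x >= 1, i.e. g(x) = 1 - x <= 0
obj = penalized_objective(lambda x: x * x, [lambda x: 1.0 - x], mu=1e3)
```

Any unconstrained optimizer, such as the differential-evolution programs named in the abstract, can then minimize `obj` directly.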
experiments showed approach achieved positive improvements conventional nmt system: improvements 2.4 0.9 bleu points small-scale large-scale tasks, respectively.",4 "variational approach consistency spectral clustering. paper establishes consistency spectral approaches data clustering. consider clustering point clouds obtained samples ground-truth measure. graph representing point cloud obtained assigning weights edges based distance points connect. investigate spectral convergence unnormalized normalized graph laplacians towards appropriate operators continuum domain. obtain sharp conditions connectivity radius scaled respect number sample points spectral convergence hold. also show discrete clusters obtained via spectral clustering converge towards continuum partition ground truth measure. continuum partition minimizes functional describing continuum analogue graph-based spectral partitioning. approach, based variational convergence, general flexible.",12 "synthesizing robust plans incomplete domain models. current planners assume complete domain models focus generating correct plans. unfortunately, domain modeling laborious error-prone task. domain experts cannot guarantee completeness, often able circumscribe incompleteness model providing annotations parts domain model may incomplete. cases, goal generate plans robust respect known incompleteness domain. paper, first introduce annotations expressing knowledge domain incompleteness, formalize notion plan robustness respect incomplete domain model. propose approach compiling problem finding robust plans conformant probabilistic planning problem. present experimental results probabilistic-ff, state-of-the-art planner, showing promise approach.",4 "bayesian network approximation edge deletion. consider problem deleting edges bayesian network purpose simplifying models probabilistic inference. particular, propose new method deleting network edges, based evidence hand. 
provide interesting bounds kl-divergence original approximate networks, highlight impact given evidence quality approximation shed light good bad candidates edge deletion. finally demonstrate empirically promise proposed edge deletion technique basis approximate inference.",4 "automated oct segmentation images dme. paper presents novel automated system segments six sub-retinal layers optical coherence tomography (oct) image stacks healthy patients patients diabetic macular edema (dme). first, image oct stack denoised using wiener deconvolution algorithm estimates additive speckle noise variance using novel fourier-domain based structural error. denoising method enhances image snr average 12 db. next, denoised images subjected iterative multi-resolution high-pass filtering algorithm detects seven sub-retinal surfaces six iterative steps. thicknesses sub-retinal layer scans particular oct stack compared manually marked groundtruth. proposed system uses adaptive thresholds denoising segmenting image hence robust disruptions retinal micro-structure due dme. proposed denoising segmentation system average error 1.2-5.8 $\mu m$ 3.5-26 $\mu m$ segmenting sub-retinal surfaces normal abnormal images dme, respectively. estimating sub-retinal layer thicknesses, proposed system average error 0.2-2.5 $\mu m$ 1.8-18 $\mu m$ normal abnormal images, respectively. additionally, average inner sub-retinal layer thickness abnormal images estimated 275 $\mu m$ ($r=0.92$) average error 9.3 $\mu m$, average thickness outer layers abnormal images estimated 57.4 $\mu m$ ($r=0.74$) average error 3.5 $\mu m$. proposed system useful tracking disease progression dme period time.",4 "intent inference syntactic tracking gmti measurements. conventional target tracking systems, human operators use estimated target tracks make higher level inference target behaviour/intent.
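The bounds discussed in the edge-deletion entry above are stated in terms of the KL-divergence between the original and approximate networks; a minimal discrete KL computation, with made-up toy joint distributions, looks like this:

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as aligned lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

# toy joints over two binary variables, before and after removing
# the edge coupling them (numbers are illustrative only)
p = [0.4, 0.1, 0.1, 0.4]        # original joint, with dependence
q = [0.25, 0.25, 0.25, 0.25]    # fully factorized approximation
```

Deleting an edge that carries strong dependence yields a large divergence; a good candidate for deletion, in the sense of the abstract, is one where this value stays small given the evidence.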
paper develops syntactic filtering algorithms assist human operators extracting spatial patterns target tracks identify suspicious/anomalous spatial trajectories. targets' spatial trajectories modeled stochastic context free grammar (scfg) switched mode state space model. bayesian filtering algorithms stochastic context free grammars presented extracting syntactic structure illustrated ground moving target indicator (gmti) radar example. performance algorithms tested experimental data collected using drdc ottawa's x-band wideband experimental airborne radar (xwear).",19 "convex similarity index sparse recovery missing image samples. paper investigates problem recovering missing samples using methods based sparse representation adapted especially image signals. instead $l_2$-norm mean square error (mse), new perceptual quality measure used similarity criterion original reconstructed images. proposed criterion called convex similarity (csim) index modified version structural similarity (ssim) index, despite predecessor, convex uni-modal. derive mathematical properties proposed index show optimally choose parameters proposed criterion, investigating restricted isometry property (rip) error-sensitivity properties. also propose iterative sparse recovery method based constrained $l_1$-norm minimization problem, incorporating csim fidelity criterion. resulting convex optimization problem solved via algorithm based alternating direction method multipliers (admm). taking advantage convexity csim index, also prove convergence algorithm globally optimal solution proposed optimization problem, starting arbitrary point. simulation results confirm performance new similarity index well proposed algorithm missing sample recovery image patch signals.",4 "optimal release time decision fuzzy mathematical programming perspective. demand high software reliability requires rigorous testing followed requirement robust modeling techniques software quality prediction.
one side, firms steadily manage reliability testing vigorously, optimal release time determination biggest concern. past many models developed much research devoted towards assessment release time software. however, majority work deals crisp study. paper addresses problem release time prediction using fuzzy logic. formulated fuzzy release time problem considering cost testing impact warranty period. results show fuzzy model good adaptability.",4 "image forensics: detecting duplication scientific images manipulation-invariant image similarity. manipulation re-use images scientific publications concerning problem currently lacks scalable solution. current tools detecting image duplication mostly manual semi-automated, despite availability overwhelming target dataset learning-based approach. paper addresses problem determining if, given two images, one manipulated version means copy, rotation, translation, scale, perspective transform, histogram adjustment, partial erasing. propose data-driven solution based 3-branch siamese convolutional neural network. convnet model trained map images 128-dimensional space, euclidean distance duplicate images smaller equal 1, distance unique images greater 1. results suggest approach potential improve surveillance published in-peer-review literature image manipulation.",4 "deep detection people mobility aids hospital robot. robots operating populated environments encounter many different types people, might advanced need cautious interaction, physical impairments advanced age. robots therefore need recognize advanced demands provide appropriate assistance, guidance forms support. paper, propose depth-based perception pipeline estimates position velocity people environment categorizes according mobility aids use: pedestrian, person wheelchair, person wheelchair person pushing them, person crutches person using walker. present fast region proposal method feeds region-based convolutional network (fast r-cnn). 
this, speed object detection process factor seven compared dense sliding window approach. furthermore propose probabilistic position, velocity class estimator smooth cnn's detections account occlusions misclassifications. addition, introduce new hospital dataset 17,000 annotated rgb-d images. extensive experiments confirm pipeline successfully keeps track people mobility aids, even challenging situations multiple people different categories frequent occlusions. videos experiments dataset available http://www2.informatik.uni-freiburg.de/~kollmitz/mobilityaids",4 "sar image segmentation using vector quantization technique entropy images. development application various remote sensing platforms result production huge amounts satellite image data. therefore, increasing need effective querying browsing image databases. order take advantage make good use satellite images data, must able extract meaningful information imagery. hence proposed new algorithm sar image segmentation. paper propose segmentation using vector quantization technique entropy image. initially, obtain entropy image second step use kekre's fast codebook generation (kfcg) algorithm segmentation entropy image. thereafter, codebook size 128 generated entropy image. code vectors clustered 8 clusters using kfcg algorithm converted 8 images. 8 images displayed result. approach avoids over-segmentation under-segmentation. compared results well known gray level co-occurrence matrix. proposed algorithm gives better segmentation less complexity.",4 "belief optimization binary networks: stable alternative loopy belief propagation. present novel inference algorithm arbitrary, binary, undirected graphs. unlike loopy belief propagation, iterates fixed point equations, directly descend bethe free energy. algorithm consists two phases, first update pairwise probabilities, given marginal probabilities unit, using analytic expression.
next, update marginal probabilities, given pairwise probabilities following negative gradient bethe free energy. steps guaranteed decrease bethe free energy, since lower bounded, algorithm guaranteed converge local minimum. also show bethe free energy equal tap free energy second order weights. experiments confirm belief propagation converges usually finds identical solutions belief optimization method. however, cases belief propagation fails converge, belief optimization continues converge reasonable beliefs. stable nature belief optimization makes ideally suited learning graphical models data.",4 "improving weather radar fusion classification. air traffic management (atm) necessary operations (tactical planing, sector configuration, required staffing, runway configuration, routing approaching aircrafts) rely accurate measurements predictions current weather situation. essential basis information delivered weather radar images (wxr), which, unfortunately, exhibit vast amount disturbances. thus, improvement datasets key factor accurate predictions weather phenomena weather conditions. image processing methods based texture analysis geometric operators allow identify regions including artefacts well zones missing information. correction zones implemented exploiting multi-spectral satellite data (meteosat second generation). results prove proposed system artefact detection data correction significantly improves quality wxr data and, thus, enables reliable weather now- forecast leading increased atm safety.",4 "demystifying deep learning: geometric approach iterative projections. parametric approaches learning, deep learning (dl), highly popular nonlinear regression, spite extremely difficult training increasing complexity (e.g. number layers dl). paper, present alternative semi-parametric framework foregoes ordinarily required feedback, introducing novel idea geometric regularization. 
show certain deep learning techniques residual network (resnet) architecture closely related approach. hence, technique used analyze types deep learning. moreover, present preliminary results confirm approach easily trained obtain complex structures.",4 "short communication quist: quick clustering algorithm. short communication introduce quick clustering algorithm (quist), efficient hierarchical clustering algorithm based sorting. quist poly-logarithmic divisive clustering algorithm assume number clusters, and/or cluster size known ahead time. also insensitive original ordering input.",4 "improve sat-solving machine learning. project, aimed improve runtime minisat, conflict-driven clause learning (cdcl) solver solves propositional boolean satisfiability (sat) problem. first used logistic regression model predict satisfiability propositional boolean formulae fixing values certain fraction variables formula. applied logistic model added preprocessing period minisat determine preferable initial value (either true false) boolean variable using monte-carlo approach. concretely, monte-carlo trial, fixed values certain ratio randomly selected variables, calculated confidence resulting sub-formula satisfiable logistic regression model. initial value variable set based mean confidence scores trials started literals variable. particularly interested setting initial values backbone variables correctly, variables value solutions sat formula. monte-carlo method able set 78% backbones correctly. excluding preprocessing time, compared default setting minisat, runtime minisat satisfiable formulae decreased 23%. however, method outperform vanilla minisat runtime, decrease conflicts outweighed long runtime preprocessing period.",4 "self-adaptive node-based pca encodings. paper propose algorithm, simple hebbian pca, prove able calculate principal component analysis (pca) distributed fashion across nodes. 
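The "simple hebbian pca" algorithm of the encoding entry above is not spelled out in the abstract; as a stand-in, the classic Oja rule below shows how a single Hebbian update extracts the first principal component (single-node sketch; the distributed, intralayer-weight-free variant claimed by the paper is not reproduced here):

```python
import random

def oja_first_component(samples, lr=0.01, epochs=200, seed=0):
    """Estimate the first principal component with Oja's Hebbian rule.

    samples: zero-mean vectors (lists of floats). Illustrative sketch;
    learning rate and epoch count are arbitrary choices.
    """
    rng = random.Random(seed)
    dim = len(samples[0])
    w = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    for _ in range(epochs):
        for x in samples:
            y = sum(wi * xi for wi, xi in zip(w, x))                   # project
            w = [wi + lr * y * (xi - y * wi) for wi, xi in zip(w, x)]  # Oja update
    return w

# toy 2-d data whose variance is dominated by the first axis
data = [[3.0, 0.1], [-3.0, -0.1], [2.5, 0.05], [-2.5, -0.05]]
w = oja_first_component(data)
```

The `-y*w` decay term keeps the weight vector near unit norm, so no explicit normalization step is needed.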
simplifies existing network structures removing intralayer weights, essentially cutting number weights need trained half.",4 "deep sliding shapes amodal 3d object detection rgb-d images. focus task amodal 3d object detection rgb-d images, aims produce 3d bounding box object metric form full extent. introduce deep sliding shapes, 3d convnet formulation takes 3d volumetric scene rgb-d image input outputs 3d object bounding boxes. approach, propose first 3d region proposal network (rpn) learn objectness geometric shapes first joint object recognition network (orn) extract geometric features 3d color features 2d. particular, handle objects various sizes training amodal rpn two different scales orn regress 3d bounding boxes. experiments show algorithm outperforms state-of-the-art 13.8 map 200x faster original sliding shapes. source code pre-trained models available github.",4 "mixed precision training convolutional neural networks using integer operations. state-of-the-art (sota) mixed precision training dominated variants low precision floating point operations, particular, fp16 accumulating fp32 micikevicius et al. (2017). hand, lot research also happened domain low mixed-precision integer training, works either present results non-sota networks (for instance alexnet imagenet-1k), relatively small datasets (like cifar-10). work, train state-of-the-art visual understanding neural networks imagenet-1k dataset, integer operations general purpose (gp) hardware. particular, focus integer fused-multiply-and-accumulate (fma) operations take two pairs int16 operands accumulate results int32 output. we propose shared exponent representation tensors develop dynamic fixed point (dfp) scheme suitable common neural network operations. nuances developing efficient integer convolution kernel examined, including methods handle overflow int32 accumulator.
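A shared-exponent dynamic fixed point tensor of the general kind described in the mixed-precision entry above can be sketched as follows; the rounding and saturation choices here are assumptions, not the paper's exact scheme:

```python
import math

def quantize_dfp16(tensor):
    """Quantize floats to int16 with one shared power-of-two scale,
    so that value ~= q * 2**exponent. Illustrative sketch only."""
    max_abs = max(abs(v) for v in tensor)
    if max_abs == 0.0:
        return [0] * len(tensor), 0
    # pick the exponent that maps max_abs just inside the int16 range
    exponent = math.ceil(math.log2(max_abs / 32767.0))
    scale = 2.0 ** exponent
    q = [max(-32768, min(32767, int(round(v / scale)))) for v in tensor]
    return q, exponent

def dequantize_dfp16(q, exponent):
    """Recover float approximations from the shared-exponent integers."""
    return [qi * (2.0 ** exponent) for qi in q]
```

Products of two such int16 tensors can then be accumulated in int32, the result's exponent being the sum of the operands' exponents, which is the setting where the abstract's overflow handling becomes necessary.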
implement cnn training resnet-50, googlenet-v1, vgg-16 alexnet; networks achieve exceed sota accuracy within number iterations fp32 counterparts without change hyper-parameters 1.8x improvement end-to-end training throughput. best knowledge results represent first int16 training results gp hardware imagenet-1k dataset using sota cnns achieve highest reported accuracy using half-precision.",4 "application fuzzy assessing reliability decision making. paper proposes new fuzzy assessing procedure application management decision making. proposed fuzzy approach build membership functions system characteristics standby repairable system. method used extract family conventional crisp intervals fuzzy repairable system desired system characteristics. determined set nonlinear parametric programming using membership functions. system characteristics governed membership functions, information provided use management, redundant system extended fuzzy environment, general repairable systems represented accurately analytic results useful designers practitioners. also beside standby, active redundancy systems used many cases article many practical instances. different studies, model provides, good estimated value based uncertain environments, comparison discussion using fuzzy theory conventional method also comparison parallel (active redundancy) series system fuzzy world standby redundancy. membership function intervals cannot inverted explicitly, system management designers specify system characteristics interest, perform numerical calculations, examine corresponding $\alpha$-cuts, use information develop improve system processes.
particularly local gestalt law good continuity described means suitable connectivity kernels, derived lie group theory neurally implemented long range connectivity v1. different kernels compatible geometric structure cortical connectivity derived fundamental solutions fokker-planck, sub-riemannian laplacian isotropic laplacian equations. kernels used construct matrices connectivity among features present visual stimulus. global gestalt constraints introduced terms spectral analysis connectivity matrix, showing processing cortically implemented v1 mean field neural equations. analysis performs grouping local features individuates perceptual units highest saliency. numerical simulations performed results obtained applying technique number stimuli.",4 "microwave breast cancer detection using empirical mode decomposition features. microwave-based breast cancer detection proposed complementary approach compensate drawbacks existing breast cancer detection techniques. among existing microwave breast cancer detection methods, machine learning-type algorithms recently become popular. focus detecting existence breast tumours rather performing imaging identify exact tumour position. key step machine learning approaches feature extraction. one widely used feature extraction method principal component analysis (pca). however, sensitive signal misalignment. paper presents empirical mode decomposition (emd)-based feature extraction method, robust misalignment. experimental results involving clinical data sets combined numerically simulated tumour responses show combined features emd pca improve detection performance ensemble selection-based classifier.",19 "classification bags, groups sets. many classification problems difficult formulate directly terms traditional supervised setting, training test samples individual feature vectors. cases samples better described sets feature vectors, labels available sets rather individual samples, or, individual labels available, independent.
better deal problems, several extensions supervised learning proposed, either training and/or test objects sets feature vectors. however, proposed rather independently other, mutual similarities differences hitherto mapped out. work, provide overview learning scenarios, propose taxonomy illustrate relationships them, discuss directions research areas.",19 "omega model human detection counting application smart surveillance system. driven significant advancements technology social issues security management, strong need smart surveillance system society today. one key features smart surveillance system efficient human detection counting system decide label events own. paper propose new, novel robust model, omega model, detecting counting human beings present scene. proposed model employs set four distinct descriptors identifying unique features head, neck shoulder regions person. unique head neck shoulder signature given omega model exploits challenges inter person variations size shape peoples head, neck shoulder regions achieve robust detection human beings even partial occlusion, dynamically changing background varying illumination conditions. experimentation observe analyze influences four descriptors system performance computation speed conclude weight based decision making system produces best results. evaluation results number images indicate validation method actual situation.",4 "robust efficient transfer learning hidden-parameter markov decision processes. introduce new formulation hidden parameter markov decision process (hip-mdp), framework modeling families related tasks using low-dimensional latent embeddings. new framework correctly models joint uncertainty latent parameters state space. also replace original gaussian process-based model bayesian neural network, enabling scalable inference. thus, expand scope hip-mdp applications higher dimensions complex dynamics.",19 "medical diagnosis laboratory tests combining generative discriminative learning. 
primary goal computational phenotype research conduct medical diagnosis. hospital, physicians rely massive clinical data make diagnosis decisions, among laboratory tests one important resources. however, longitudinal incomplete nature laboratory test data casts significant challenge interpretation usage, may result harmful decisions human physicians automatic diagnosis systems. work, take advantage deep generative models deal complex laboratory tests. specifically, propose end-to-end architecture involves deep generative variational recurrent neural networks (vrnn) learn robust generalizable features, discriminative neural network (nn) model learn diagnosis decision making, two models trained jointly. experiments conducted dataset involving 46,252 patients, 50 frequent tests used predict 50 common diagnoses. results show model, vrnn+nn, significantly (p<0.001) outperforms baseline models. moreover, demonstrate representations learned joint training informative learned pure generative models. finally, find model offers surprisingly good imputation missing values.",4 "tasselnet: counting maize tassels wild via local counts regression network. accurately counting maize tassels important monitoring growth status maize plants. tedious task, however, still mainly done manual efforts. context modern plant phenotyping, automating task required meet need large-scale analysis genotype phenotype. recent years, computer vision technologies experienced significant breakthrough due emergence large-scale datasets increased computational resources. naturally image-based approaches also received much attention plant-related studies. yet fact image-based systems plant phenotyping deployed controlled laboratory environment. transferring application scenario unconstrained in-field conditions, intrinsic extrinsic variations wild pose great challenges accurate counting maize tassels, goes beyond ability conventional image processing techniques. 
calls robust computer vision approaches address in-field variations. paper studies in-field counting problem maize tassels. knowledge, first time plant-related counting problem considered using computer vision technologies unconstrained field-based environment.",4 "rho decision-theoretic apparatus dempster-shafer theory. thomas m. strat developed decision-theoretic apparatus dempster-shafer theory (decision analysis using belief functions, intern. j. approx. reason. 4(5/6), 391-417, 1990). apparatus, expected utility intervals constructed different choices. choice highest expected utility preferable others. however, find preferred choice expected utility interval one choice included another, necessary interpolate discerning point intervals. done parameter rho, defined probability ambiguity utility every nonsingleton focal element turn favorable possible. several different decision makers, might sometimes interested highest expected utility among decision makers rather trying maximize expected utility regardless choices made decision makers. preference choice determined probability yielding highest expected utility. probability equal maximal interval length rho alternative preferred. must take account choices already made decision makers also rational choices assume made later decision makers. strat's apparatus, assumption, unwarranted evidence hand, made value rho. demonstrate assumption necessary. sufficient assume uniform probability distribution rho able discern preferable choice. discuss approach justifiable.",4 "analyzing sparse dictionaries online learning kernels. many signal processing machine learning methods share essentially linear-in-the-parameter model, many parameters available samples kernel-based machines. sparse approximation essential many disciplines, new challenges emerging online learning kernels.
end, several sparsity measures proposed literature quantify sparse dictionaries constructing relevant ones, prolific ones distance, approximation, coherence babel measures. paper, analyze sparse dictionaries based measures. conducting eigenvalue analysis, show sparsity measures share many properties, including linear independence condition inducing well-posed optimization problem. furthermore, prove exists quasi-isometry parameter (i.e., dual) space dictionary's induced feature space.",19 "gtr-model: universal framework quantum-like measurements. present general geometrico-dynamical description physical abstract entities, called 'general tension-reduction' (gtr) model, states, also measurement-interactions represented, associated outcome probabilities calculated. underlying model hypothesis indeterminism manifests consequence unavoidable fluctuations experimental context, accordance 'hidden-measurements interpretation' quantum mechanics. structure state space hilbertian, measurements 'universal' kind, i.e., result average possible ways selecting outcome, gtr-model provides predictions born rule, therefore provides natural completed version quantum mechanics. however, structure state space non-hilbertian and/or possible ways selecting outcome available actualized, predictions model generally differ quantum ones, especially sequential measurements considered. paradigmatic examples discussed, taken physics human cognition. particular attention given known psychological effects, like question order effects response replicability, show able generate non-hilbertian statistics. also suggest realistic interpretation gtr-model, applied human cognition decision, think could become generally adopted interpretative framework quantum cognition research.",18 "temporal convolutional neural networks diagnosis lab tests. early diagnosis treatable diseases essential improving healthcare, many diseases' onsets predictable annual lab tests temporal trends. 
introduce multi-resolution convolutional neural network early detection multiple diseases irregularly measured sparse lab values. novel architecture takes input imputed version data binary observation matrix. imputing temporal sparse observations, develop flexible, fast train method differentiable multivariate kernel regression. experiments data 298k individuals 8 years, 18 common lab measurements, 171 diseases show temporal signatures learned via convolution significantly predictive baselines commonly used early disease diagnosis.",4 "weighted-svd: matrix factorization weights latent factors. matrix factorization models, sometimes called latent factor models, family methods recommender system research area (1) generate latent factors users items (2) predict users' ratings items based latent factors. however, current matrix factorization models presume latent factors equally weighted, may always reasonable assumption practice. paper, propose new model, called weighted-svd, integrate linear regression model svd model latent factor accompanies corresponding weight parameter. mechanism allows latent factors different weights influence final ratings. complexity weighted-svd model slightly larger svd model much smaller svd++ model. compared weighted-svd model several latent factor models five public datasets based root-mean-squared-errors (rmses). results show weighted-svd model outperforms baseline methods experimental datasets almost settings.",4 "low-rank optimization trace norm penalty. paper addresses problem low-rank trace norm minimization. propose algorithm alternates fixed-rank optimization rank-one updates. fixed-rank optimization characterized efficient factorization makes trace norm differentiable search space computation duality gap numerically tractable. search space nonlinear equipped particular riemannian structure leads efficient computations. present second-order trust-region algorithm guaranteed quadratic rate convergence. 
overall, proposed optimization scheme converges super-linearly global solution maintaining complexity linear number rows columns matrix. compute set solutions efficiently grid regularization parameters propose predictor-corrector approach outperforms naive warm-restart approach fixed-rank quotient manifold. performance proposed algorithm illustrated problems low-rank matrix completion multivariate linear regression.",12 "systematic review hindi prosody. prosody describes form function sentence using suprasegmental features speech. prosody phenomena explored domain higher phonological constituents word, phonological phrase intonational phrase. study prosody word level called word prosody sentence level called sentence prosody. word prosody describes stress pattern comparing prosodic features constituent syllables. sentence prosody involves study phrasing pattern intonational pattern language. aim study summarize existing works hindi prosody carried different domain language speech processing. review presented systematic fashion could useful resource one wants build existing works.",4 "neural-network techniques visual mining clinical electroencephalograms. chapter describe new neural-network techniques developed visual mining clinical electroencephalograms (eegs), weak electrical potentials invoked brain activity. techniques exploit fruitful ideas group method data handling (gmdh). section 2 briefly describes standard neural-network techniques able learn well-suited classification modes data presented relevant features. section 3 introduces evolving cascade neural network technique adds new input nodes well new neurons network training error decreases. algorithm applied recognize artifacts clinical eegs. section 4 presents gmdh-type polynomial networks learnt data. applied technique distinguish eegs recorded alzheimer healthy patient well recognize eeg artifacts. section 5 describes new neural-network technique developed induce multi-class concepts data.
used technique inducing 16-class concept large-scale clinical eeg data. finally discuss perspectives applying neural-network techniques clinical eegs.",4 "online adaptive hidden markov model multi-tracker fusion. paper, propose novel method visual object tracking called hmmtxd. method fuses observations complementary out-of-the-box trackers detector utilizing hidden markov model whose latent states correspond binary vector expressing failure individual trackers. markov model trained unsupervised way, relying online learned detector provide source tracker-independent information modified baum-welch algorithm updates model w.r.t. partially annotated data. show effectiveness proposed method combination two three tracking algorithms. performance hmmtxd evaluated two standard benchmarks (cvpr2013 vot) rich collection 77 publicly available sequences. hmmtxd outperforms state-of-the-art, often significantly, datasets almost criteria.",4 "compact kernel approximation 3d action recognition. 3d action recognition shown benefit covariance representation input data (joint 3d positions). kernel machine feed feature effective paradigm 3d action recognition, yielding state-of-the-art results. yet, whole framework affected well-known scalability issue. fact, general, kernel function evaluated pairs instances inducing gram matrix whose complexity quadratic number samples. work reduce complexity linear proposing novel explicit feature map approximate kernel function. allows train linear classifier explicit feature encoding, implicitly implements log-euclidean machine scalable fashion. prove proposed approximation unbiased, also work explicit strong bound variance, attesting theoretical superiority approach respect existing ones. experimentally, verify representation provides compact encoding outperforms approximation schemes number publicly available benchmark datasets 3d action recognition.",4 "identification arabic word bilingual text using character features.
identification language script important stage process recognition writing. several works research area, treat various languages. used methods global statistical. present paper, study possibility using features scripts identify language. identification language script characteristics returns identification case multilingual documents less difficult. present work, study possibility using structural features identify arabic language arabic / latin text.",4 "beyond temporal pooling: recurrence temporal convolutions gesture recognition video. recent studies demonstrated power recurrent neural networks machine translation, image captioning speech recognition. task capturing temporal structure video, however, still remain numerous open research questions. current research suggests using simple temporal feature pooling strategy take account temporal aspect video. demonstrate method sufficient gesture recognition, temporal information discriminative compared general video classification tasks. explore deep architectures gesture recognition video propose new end-to-end trainable neural network architecture incorporating temporal convolutions bidirectional recurrence. main contributions twofold; first, show recurrence crucial task; second, show adding temporal convolutions leads significant improvements. evaluate different approaches montalbano gesture recognition dataset, achieve state-of-the-art results.",4 "generalization bounds metric similarity learning. recently, metric learning similarity learning attracted large amount interest. many models optimisation algorithms proposed. however, relatively little work generalization analysis methods. paper, derive novel generalization bounds metric similarity learning. particular, first show generalization analysis reduces estimation rademacher average ""sums-of-i.i.d."" sample-blocks related specific matrix norm. 
then, derive generalization bounds metric/similarity learning different matrix-norm regularisers estimating specific rademacher complexities. analysis indicates sparse metric/similarity learning $l^1$-norm regularisation could lead significantly better bounds frobenius-norm regularisation. novel generalization analysis develops refines techniques u-statistics rademacher complexity analysis.",4 "team behavior interactive dynamic influence diagrams applications ad hoc teams. planning ad hoc teamwork challenging involves agents collaborating without prior coordination communication. focus principled methods single agent cooperate others. motivates investigating ad hoc teamwork problem context individual decision making frameworks. however, individual decision making multiagent settings faces task reason agents' actions, turn involves reasoning others. established approximation operationalizes approach bound infinite nesting introducing level 0 models. show consequence finitely-nested modeling may obtain optimal team solutions cooperative settings. address limitation including models level 0 whose solutions involve learning. demonstrate learning integrated planning context interactive dynamic influence diagrams facilitates optimal team behavior, applicable ad hoc teamwork.",4 "skeleton key: image captioning skeleton-attribute decomposition. recently, lot interest automatically generating descriptions image. existing language-model based approaches task learn generate image description word word original word order. however, humans, natural locate objects relationships first, elaborate object, describing notable attributes. present coarse-to-fine method decomposes original image description skeleton sentence attributes, generates skeleton sentence attribute phrases separately. decomposition, method generate accurate novel descriptions previous state-of-the-art. 
experimental results ms-coco larger scale stock3m datasets show algorithm yields consistent improvements across different evaluation metrics, especially spice metric, much higher correlation human ratings conventional metrics. furthermore, algorithm generate descriptions varied length, benefiting separate control skeleton attributes. enables image description generation better accommodates user preferences.",4 "frequency-based patrolling heterogeneous agents limited communication. paper investigates multi-agent frequency-based patrolling intersecting, circle graphs conditions graph nodes non-uniform visitation requirements agents limited ability communicate. task modeled partially observable markov decision process, reinforcement learning solution developed. agent generates policy markov chains, policies exchanged agents occupy adjacent nodes. constraint policy exchange models sparse communication conditions large, unstructured environments. empirical results provide perspectives convergence properties, agent cooperation, generalization learned patrolling policies new instances task. emergent behavior indicates learned coordination strategies heterogeneous agents patrolling large, unstructured regions well ability generalize dynamic variation node visitation requirements.",4 "framework generalizing graph-based representation learning methods. random walks heart many existing deep learning algorithms graph data. however, algorithms many limitations arise use random walks, e.g., features resulting methods unable transfer new nodes graphs tied node identity. work, introduce notion attributed random walks serves basis generalizing existing methods deepwalk, node2vec, many others leverage random walks. proposed framework enables methods widely applicable transductive inductive learning well use graphs attributes (if available). achieved learning functions generalize new nodes graphs.
show proposed framework effective average auc improvement 16.1% requiring average 853 times less space existing methods variety graphs several domains.",19 "making life better one large system time: challenges uai research. rapid growth diversity service offerings ensuing complexity information technology ecosystems present numerous management challenges (both operational strategic). instrumentation measurement technology is, large, keeping pace development growth. however, algorithms, tools, technology required transform data relevant information decision making not. claim paper (and invited talk) line research conducted uncertainty artificial intelligence well suited address challenges close gap. support claim discuss open problems using recent examples diagnosis, model discovery, policy optimization three real life distributed systems.",4 "improving decision analytics deep learning: case financial disclosures. decision analytics commonly focuses text mining financial news sources order provide managerial decision support predict stock market movements. existing predictive frameworks almost exclusively apply traditional machine learning methods, whereas recent research indicates traditional machine learning methods sufficiently capable extracting suitable features capturing non-linear nature complex tasks. remedy, novel deep learning models aim overcome issue extending traditional neural network models additional hidden layers. indeed, deep learning shown outperform traditional methods terms predictive performance. paper, adapt novel deep learning technique financial decision support. instance, aim predict direction stock movements following financial disclosures. result, show deep learning outperform accuracy random forests benchmark machine learning 5.66%.",19 "multi-armed bandits unit interval graphs. online learning problem side information similarity dissimilarity across different actions considered. 
problem formulated stochastic multi-armed bandit problem graph-structured learning space. node graph represents arm bandit problem edge two nodes represents closeness mean rewards. shown resulting graph unit interval graph. hierarchical learning policy developed offers sublinear scaling regret size learning space fully exploiting side information offline reduction learning space online aggregation reward observations similar arms. order optimality proposed policy terms size learning space length time horizon established matching lower bound regret. shown mean rewards bounded, complete learning bounded regret infinite time horizon achieved. extension case partial information arm similarity dissimilarity also discussed.",4 "active learning inverse models intrinsically motivated goal exploration robots. introduce self-adaptive goal generation - robust intelligent adaptive curiosity (sagg-riac) architecture intrinsically motivated goal exploration mechanism allows active learning inverse models high-dimensional redundant robots. allows robot efficiently actively learn distributions parameterized motor skills/policies solve corresponding distribution parameterized tasks/goals. architecture makes robot sample actively novel parameterized tasks task space, based measure competence progress, triggers low-level goal-directed learning motor policy parameters allow solve it. learning generalization, system leverages regression techniques allow infer motor policy parameters corresponding given novel parameterized task, based previously learnt correspondences policy task parameters. present experiments high-dimensional continuous sensorimotor spaces three different robotic setups: 1) learning inverse kinematics highly-redundant robotic arm, 2) learning omnidirectional locomotion motor primitives quadruped robot, 3) arm learning control fishing rod flexible wire.
show 1) exploration task space lot faster exploration actuator space learning inverse models redundant robots; 2) selecting goals maximizing competence progress creates developmental trajectories driving robot progressively focus tasks increasing complexity statistically significantly efficient selecting tasks randomly, well efficient different standard active motor babbling methods; 3) architecture allows robot actively discover parts task space learn reach part cannot.",4 "automatic labelling topics neural embeddings. topics generated topic models typically represented list terms. reduce cognitive overhead interpreting topics end-users, propose labelling topic succinct phrase summarises theme idea. using wikipedia document titles label candidates, compute neural embeddings documents words select relevant labels topics. compared state-of-the-art topic labelling system, methodology simpler, efficient, finds better topic labels.",4 "end-to-end convolutional selective autoencoder approach soybean cyst nematode eggs detection. paper proposes novel selective autoencoder approach within framework deep convolutional networks. crux idea train deep convolutional autoencoder suppress undesired parts image frame allowing desired parts resulting efficient object detection. efficacy framework demonstrated critical plant science problem. united states, approximately $1 billion lost per annum due nematode infection soybean plants. currently, plant-pathologists rely labor-intensive time-consuming identification soybean cyst nematode (scn) eggs soil samples via manual microscopy. proposed framework attempts significantly expedite process using series manually labeled microscopic images training followed automated high-throughput egg detection. problem particularly difficult due presence large population non-egg particles (disturbances) image frames similar scn eggs shape, pose illumination. 
therefore, selective autoencoder trained learn unique features related invariant shapes sizes scn eggs without handcrafting. that, composite non-maximum suppression differencing applied post-processing stage.",4 "extended object tracking: introduction, overview applications. article provides elaborate overview current research extended object tracking. provide clear definition extended object tracking problem discuss delimitation types object tracking. next, different aspects extended object modelling extensively discussed. subsequently, give tutorial introduction two basic well used extended object tracking approaches - random matrix approach kalman filter-based approach star-convex shapes. next part treats tracking multiple extended objects elaborates large number feasible association hypotheses tackled using random finite set (rfs) non-rfs multi-object trackers. article concludes summary current applications, four example applications involving camera, x-band radar, light detection ranging (lidar), red-green-blue-depth (rgb-d) sensors highlighted.",4 "attend interact: higher-order object interactions video understanding. human actions often involve complex interactions across several inter-related objects scene. however, existing approaches fine-grained video understanding visual relationship detection often rely single object representation pairwise object relationships. furthermore, learning interactions across multiple objects hundreds frames video computationally infeasible performance may suffer since large combinatorial space modeled. paper, propose efficiently learn higher-order interactions arbitrary subgroups objects fine-grained video understanding. demonstrate modeling object interactions significantly improves accuracy action recognition video captioning, saving 3-times computation traditional pairwise relationships. proposed method validated two large-scale datasets: kinetics activitynet captions. 
sinet sinet-caption achieve state-of-the-art performances datasets even though videos sampled maximum 1 fps. best knowledge, first work modeling object interactions open domain large-scale video datasets, additionally model higher-order object interactions improves performance low computational costs.",4 "world graph?. discovering statistical structure links fundamental problem analysis social networks. choosing misspecified model, equivalently, incorrect inference algorithm result invalid analysis even falsely uncover patterns fact artifacts model. work focuses unifying two widely used link-formation models: stochastic blockmodel (sbm) small world (or latent space) model (swm). integrating techniques kernel learning, spectral graph theory, nonlinear dimensionality reduction, develop first statistically sound polynomial-time algorithm discover latent patterns sparse graphs models. network comes sbm, algorithm outputs block structure. swm, algorithm outputs estimates node's latent position.",4 "extending term subsumption systems uncertainty management. major difficulty developing maintaining large knowledge bases originates variety forms knowledge made available kb builder. objective research bring together two complementary knowledge representation schemes: term subsumption languages, represent reason defining characteristics concepts, proximate reasoning models, deal uncertain knowledge data expert systems. previous works area primarily focused probabilistic inheritance. paper, address two important issues regarding integration term subsumption-based systems approximate reasoning models. first, outline general architecture specifies interactions deductive reasoner term subsumption system approximate reasoner. second, generalize semantics terminological language terminological knowledge used make plausible inferences. 
architecture, combined generalized semantics, forms foundation synergistic tight integration term subsumption systems approximate reasoning models.",4 "stabilizing gan training multiple random projections. training generative adversarial networks unstable high-dimensions true data distribution lies lower-dimensional manifold. discriminator easily able separate nearly generated samples leaving generator without meaningful gradients. propose training single generator simultaneously array discriminators, looks different random low-dimensional projection data. show individual discriminators provide stable gradients generator, generator learns produce samples consistent full data distribution satisfy discriminators. demonstrate practical utility approach experimentally, show able produce image samples higher quality traditional training single discriminator.",4 "tutorial distributed (non-bayesian) learning: problem, algorithms results. overview results distributed learning focus family recently proposed algorithms known non-bayesian social learning. consider different approaches distributed learning problem algorithmic solutions case finitely many hypotheses. original centralized problem discussed first, followed generalization distributed setting. results convergence convergence rate presented asymptotic finite time regimes. various extensions discussed dealing directed time-varying networks, nesterov's acceleration technique continuum sets hypothesis.",12 "note tight lower bound mnl-bandit assortment selection models. note prove tight lower bound mnl-bandit assortment selection model matches upper bound given (agrawal et al., 2016a,b) parameters, logarithmic factors.",19 "magnifyme: aiding cross resolution face recognition via identity aware synthesis. enhancing low resolution images via super-resolution image synthesis cross-resolution face recognition well studied. several image processing machine learning paradigms explored addressing same. 
research, propose synthesis via deep sparse representation algorithm synthesizing high resolution face image low resolution input image. proposed algorithm learns multi-level sparse representation high low resolution gallery images, along identity aware dictionary transformation function two representations face identification scenarios. low resolution test data input, high resolution test image synthesized using identity aware dictionary transformation used face recognition. performance proposed sdsr algorithm evaluated four databases, including one real world dataset. experimental results comparison existing seven algorithms demonstrate efficacy proposed algorithm terms face identification image quality measures.",4 "natural language inference interaction space: iclr 2018 reproducibility report. tried reproduce results paper ""natural language inference interaction space"" submitted iclr 2018 conference part iclr 2018 reproducibility challenge. initially, aware code available, started implement network scratch. evaluated version model stanford nli dataset reached 86.38% accuracy test set, paper claims 88.0% accuracy. main difference, understand it, comes optimizers way model selection performed.",4 "general robust loss function. present two-parameter loss function viewed generalization many popular loss functions used robust statistics: cauchy/lorentzian, geman-mcclure, welsch/leclerc, generalized charbonnier loss functions (and transitivity l2, l1, l1-l2, pseudo-huber/charbonnier loss functions). penalty viewed negative log-likelihood, yields general probability distribution includes normal cauchy distributions special cases. describe visualize loss corresponding distribution, document several useful properties.",4 "heuristic method solving problem partitioning graphs supply demand. paper present greedy algorithm solving problem maximum partitioning graphs supply demand (mpgsd). goal method solve mpgsd large graphs reasonable time limit. 
done using two stage greedy algorithm, two corresponding types heuristics. solutions acquired way improved applying computationally inexpensive, hill climbing like, greedy correction procedure. numeric experiments analyze different heuristic functions stage greedy algorithm, show performance highly dependent properties specific instance. tests show exploring relatively small number solutions generated combining different heuristic functions, applying proposed correction procedure find solutions within percent optimal ones.",4 "searching one billion vectors: re-rank source coding. recent indexing techniques inspired source coding shown successful index billions high-dimensional vectors memory. paper, propose approach re-ranks neighbor hypotheses obtained compressed-domain indexing methods. contrast usual post-verification scheme, performs exact distance calculation short-list hypotheses, estimated distances refined based short quantization codes, avoid reading full vectors disk. released new public dataset one billion 128-dimensional vectors proposed experimental setup evaluate high dimensional indexing algorithms realistic scale. experiments show method accurately efficiently re-ranks neighbor hypotheses using little memory compared full vectors representation.",4 "joint cuts matching partitions one graph. two fundamental problems, graph cuts graph matching investigated decades, resulting vast literature two topics respectively. however way jointly applying solving graph cuts matching receives attention. paper, first formalize problem simultaneously cutting graph two partitions i.e. graph cuts establishing correspondence i.e. graph matching. develop optimization algorithm updating matching cutting alternatively, provided theoretical analysis. efficacy algorithm verified synthetic dataset real-world images containing similar regions structures.",4 "robust multimodal graph matching: sparse coding meets graph matching. 
Graph matching is a challenging problem with important applications in a wide range of fields, from image and video analysis to biological and biomedical problems. We propose a robust graph matching algorithm inspired by sparsity-related techniques. We cast the problem, resembling group and collaborative sparsity formulations, as a non-smooth convex optimization problem that can be efficiently solved using augmented lagrangian techniques. The method can deal with weighted or unweighted graphs, as well as multimodal data, where different graphs represent different types of data. The proposed approach is also naturally integrated with collaborative graph inference techniques, solving general network inference problems where the observed variables, possibly coming from different modalities, are not in correspondence. The algorithm is tested and compared with state-of-the-art graph matching techniques on both synthetic and real graphs. We also present results on multimodal graphs and applications to collaborative inference of brain connectivity from alignment-free functional magnetic resonance imaging (fmri) data. The code is publicly available.",12 "performance of a hybrid genetic algorithm in dynamic environments. The ability to track the optimum in dynamic environments is important for many practical applications. In this paper, the capability of a hybrid genetic algorithm (hga) to track the optimum in dynamic environments is investigated for different functional dimensions, update frequencies, and displacement strengths in different types of dynamic environments. Experimental results are reported using the hga and some existing evolutionary algorithms from the literature. The results show that the hga has better capability to track the dynamic optimum than the existing algorithms.",4 "inductive representation learning in large attributed graphs. Graphs (networks) are ubiquitous and allow us to model entities (nodes) and the dependencies (edges) between them. Learning a useful feature representation from graph data lies at the heart of the success of many machine learning tasks such as classification, anomaly detection, and link prediction, among many others. Many existing techniques use random walks as a basis for learning features or estimating the parameters of a graph model for a downstream prediction task.
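As a rough illustration of the random-walk feature pipeline this abstract refers to (a minimal sketch of DeepWalk-style walk generation, not the authors' implementation; the toy graph and hyper-parameters below are invented):

```python
import random

def random_walks(adj, walk_len=5, walks_per_node=2, seed=0):
    """Generate truncated random walks over an adjacency dict
    (DeepWalk-style corpus used to train node embeddings)."""
    rng = random.Random(seed)
    walks = []
    for start in adj:
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_len:
                nbrs = adj[walk[-1]]
                if not nbrs:
                    break  # dead end: stop the walk early
                walk.append(rng.choice(nbrs))
            walks.append(walk)
    return walks

# toy graph: a triangle (0,1,2) plus a pendant node 3
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
walks = random_walks(adj)
```

The resulting walk corpus would then be fed to a skip-gram-style model; note that walks are keyed to node identities, which is exactly the limitation the abstract's attributed random walks address.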
Examples include recent node embedding methods such as deepwalk and node2vec, as well as graph-based deep learning algorithms. However, the simple random walk used by these methods is fundamentally tied to the identity of the node. This has three main disadvantages. First, these approaches are inherently transductive and do not generalize to unseen nodes or other graphs. Second, they are not space-efficient, as a feature vector is learned for each node, which is impractical for large graphs. Third, these approaches lack support for attributed graphs. To make these methods more generally applicable, we propose a framework for inductive network representation learning based on the notion of an attributed random walk that is not tied to node identity and is instead based on learning a function $\phi : \mathrm{\rm \bf x} \rightarrow w$ that maps a node attribute vector $\mathrm{\rm \bf x}$ to a type $w$. This framework serves as a basis for generalizing existing methods such as deepwalk, node2vec, and many other previous methods that leverage traditional random walks.",19 "ensemble of distributed learners for online classification of dynamic data streams. We present an efficient distributed online learning scheme to classify data captured from distributed, heterogeneous, and dynamic data sources. Our scheme consists of multiple distributed local learners, which analyze different streams of data that are correlated to a common event that needs to be classified. Each learner uses a local classifier to make a local prediction. The local predictions are then collected by each learner and combined using a weighted majority rule to output the final prediction. We propose a novel online ensemble learning algorithm to update the aggregation rule in order to adapt to the underlying data dynamics. We rigorously determine a bound for the worst case misclassification probability of our algorithm which depends on the misclassification probabilities of the best static aggregation rule and of the best local classifier. Importantly, the worst case misclassification probability of our algorithm tends asymptotically to 0 if the misclassification probability of the best static aggregation rule or the misclassification probability of the best local classifier tends to 0. We then extend our algorithm to address challenges specific to the distributed implementation and prove new bounds that apply to these settings. Finally, we test our scheme by performing an evaluation study on several data sets.
When applied to data sets widely used in the literature dealing with dynamic data streams and concept drift, our scheme exhibits performance gains ranging from 34% to 71% with respect to state of the art solutions.",4 "are kalman-filter restless bandits indexable?. We study the restless bandit associated with an extremely simple scalar kalman filter model in discrete time. Under certain assumptions, we prove that the problem is indexable in the sense that the whittle index is a non-decreasing function of the relevant belief state. In spite of the long history of this problem, this appears to be the first such proof. We use results about schur-convexity and mechanical words, which are particular binary strings intimately related to palindromes.",19 "from statistical knowledge bases to degrees of belief. An intelligent agent will often be uncertain about various properties of its environment, and when acting in that environment it will frequently need to quantify its uncertainty. For example, if the agent wishes to employ the expected-utility paradigm of decision theory to guide its actions, it will need to assign degrees of belief (subjective probabilities) to various assertions. Of course, these degrees of belief should not be arbitrary, but rather based on the information available to the agent. This paper describes one approach for inducing degrees of belief from very rich knowledge bases, which can include information about particular individuals, statistical correlations, physical laws, and default rules. We call our approach the random-worlds method. The method is based on the principle of indifference: it treats all of the worlds the agent considers possible as being equally likely. It is able to integrate qualitative default reasoning with quantitative probabilistic reasoning by providing a language in which both types of information can be easily expressed. Our results show that a number of desiderata that arise in direct inference (reasoning from statistical information to conclusions about individuals) and default reasoning follow directly from the semantics of random worlds. For example, random worlds captures important patterns of reasoning such as specificity, inheritance, indifference to irrelevant information, and default assumptions of independence. Furthermore, the expressive power of the language used and the intuitive semantics of random worlds allow the method to deal with problems that are beyond the scope of many other non-deductive reasoning systems.",4 "probabilistic reasoning via deep learning: neural association models.
In this paper, we propose a new deep learning approach, called the neural association model (nam), for probabilistic reasoning in artificial intelligence. We propose to use neural networks to model the association between any two events in a domain. Neural networks take one event as input and compute the conditional probability of the other event, to model how likely these two events are associated. The actual meaning of the conditional probabilities varies between applications and depends on how the models are trained. In this work, as two case studies, we have investigated two nam structures, namely deep neural networks (dnn) and relation-modulated neural nets (rmnn), on several probabilistic reasoning tasks in ai, including recognizing textual entailment, triple classification in multi-relational knowledge bases, and commonsense reasoning. Experimental results on several popular datasets derived from wordnet, freebase and conceptnet have demonstrated that both dnns and rmnns perform equally well and that they can significantly outperform the conventional methods available for these reasoning tasks. Moreover, compared with dnns, rmnns are superior for knowledge transfer, where a pre-trained model can be quickly extended to an unseen relation after observing only a few training samples. To further prove the effectiveness of the proposed models, we have applied nams to solving challenging winograd schema (ws) problems. Experiments conducted on a set of ws problems show that the proposed models have potential for commonsense reasoning.",4 "enhancing observability in distribution grids using smart meter data. Due to limited metering infrastructure, distribution grids are currently challenged by observability issues. On the other hand, smart meter data, including local voltage magnitudes and power injections, are communicated to the utility operator from grid buses with renewable generation and demand-response programs. This work employs grid data from metered buses towards inferring the underlying grid state. To this end, a coupled formulation of the power flow problem (cpf) is put forth. Exploiting the high variability of injections at metered buses, the controllability of solar inverters, and the relative time-invariance of conventional loads, the idea is to solve the non-linear power flow equations jointly over consecutive time instants.
An intuitive and easily verifiable rule pertaining to the locations of metered and non-metered buses on the physical grid is shown to be a necessary and sufficient criterion for local observability in radial networks. To account for noisy smart meter readings, a coupled power system state estimation (cpsse) problem is further developed. Both the cpf and cpsse tasks are tackled via augmented semi-definite program relaxations. The observability criterion along with the cpf and cpsse solvers are numerically corroborated using synthetic and actual solar generation and load data on the ieee 34-bus benchmark feeder.",12 "a bayesian network scoring metric based on globally uniform parameter priors. We introduce a new bayesian network (bn) scoring metric called the global uniform (gu) metric. This metric is based on a particular type of default parameter prior. Such priors may be useful when a bn developer is not willing or able to specify domain-specific parameter priors. The gu parameter prior specifies that every prior joint probability distribution p consistent with a bn structure s is considered to be equally likely, where a distribution p is consistent with s if p includes the set of independence relations defined by s. We show that the gu metric addresses some undesirable behavior of the bdeu and k2 bayesian network scoring metrics, which also use particular forms of default parameter priors. A closed form formula for computing gu for special classes of bns is derived. Efficiently computing gu for an arbitrary bn remains an open problem.",4 "knapsack based optimal policies for budget-limited multi-armed bandits. In budget-limited multi-armed bandit (mab) problems, the learner's actions are costly and constrained by a fixed budget. Consequently, an optimal exploitation policy may not be to pull the optimal arm repeatedly, as is the case in other variants of mab, but rather to pull the sequence of different arms that maximises the agent's total reward within the budget. This difference from existing mabs means that new approaches to maximising the total reward are required. Given this, we develop two pulling policies, namely: (i) kube; and (ii) fractional kube. Whereas the former provides better performance, up to 40% in our experimental settings, the latter is computationally less expensive. We also prove logarithmic upper bounds for the regret of both policies, and show that these bounds are asymptotically optimal (i.e.
they differ from the best possible regret only by a constant factor).",4 "recurrent neural networks to correct satellite image classification maps. While initially devised for image categorization, convolutional neural networks (cnns) are being increasingly used for the pixelwise semantic labeling of images. However, the proper nature of the most common cnn architectures makes them good at recognizing but poor at localizing objects precisely. This problem is magnified in the context of aerial and satellite image labeling, where spatially fine object outlining is of paramount importance. Different iterative enhancement algorithms have been presented in the literature to progressively improve the coarse cnn outputs, seeking to sharpen object boundaries around real image edges. However, one must carefully design, choose and tune such algorithms. Instead, our goal is to directly learn the iterative process itself. For this, we formulate a generic iterative enhancement process inspired by partial differential equations, and observe that it can be expressed as a recurrent neural network (rnn). Consequently, we train such a network from manually labeled data for the enhancement task. In a series of experiments we show that our rnn effectively learns an iterative process that significantly improves the quality of satellite image classification maps.",4 "semantic foggy scene understanding with synthetic data. This work addresses the problem of semantic foggy scene understanding (sfsu). Although extensive research has been performed on image dehazing and on semantic scene understanding with weather-clear images, little attention has been paid to sfsu. Due to the difficulty of collecting and annotating foggy images, we choose to generate synthetic fog on real images that depict weather-clear outdoor scenes, and then leverage these synthetic data for sfsu by employing state-of-the-art convolutional neural networks (cnn). In particular, a complete pipeline to generate synthetic fog on real, weather-clear images using incomplete depth information is developed. We apply our fog synthesis to the cityscapes dataset and generate foggy cityscapes with 20550 images. sfsu is tackled in two fashions: 1) with typical supervised learning, and 2) with a novel semi-supervised learning, which combines 1) with an unsupervised supervision transfer from weather-clear images to their synthetic foggy counterparts.
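Fog synthesis from depth, as described in this abstract, typically builds on the standard atmospheric scattering model I = J·t + A·(1 - t) with transmittance t = exp(-β·d). A minimal numpy sketch of that model follows (the paper's actual pipeline, which handles incomplete depth, is more involved; β, the airlight A, and the toy inputs here are assumptions):

```python
import numpy as np

def add_synthetic_fog(image, depth, beta=0.05, airlight=1.0):
    """Apply the atmospheric scattering model I = J*t + A*(1-t),
    with per-pixel transmittance t = exp(-beta * depth)."""
    t = np.exp(-beta * depth)[..., None]      # (H, W, 1), broadcasts over RGB
    return image * t + airlight * (1.0 - t)

# toy example: uniform gray 4x4 image, depth increasing left to right
img = np.full((4, 4, 3), 0.5)
depth = np.tile(np.linspace(1.0, 50.0, 4), (4, 1))
foggy = add_synthetic_fog(img, depth)
```

Pixels at larger depth receive lower transmittance and are pulled toward the airlight color, which is the visual signature of fog.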
In addition, this work carefully studies the usefulness of image dehazing for sfsu. For evaluation, we present foggy driving, a dataset with 101 real-world images depicting foggy driving scenes, which come with ground truth annotations for semantic segmentation and object detection. Extensive experiments show that 1) supervised learning with our synthetic data significantly improves the performance of state-of-the-art cnns for sfsu on foggy driving; 2) our semi-supervised learning strategy further improves performance; and 3) image dehazing marginally benefits sfsu with our learning strategy. The datasets, models and code are made publicly available to encourage further research in this direction.",4 "transfer learning for video recognition with scarce training data for deep convolutional neural networks. Unconstrained video recognition and deep convolution networks (dcn) are two active topics in computer vision recently. In this work, we apply dcns as frame-based recognizers for video recognition. Our preliminary studies, however, show that video corpora with complete ground truth are usually not large and diverse enough to learn a robust model. Networks trained directly on the video data set suffer from significant overfitting and have poor recognition rates on the test set. The same lack-of-training-sample problem limits the usage of deep models on a wide range of computer vision problems where obtaining training data is difficult. To overcome this problem, we perform transfer learning from images to videos to utilize the knowledge in a weakly labeled image corpus for video recognition. The image corpus helps to learn important visual patterns for natural images, while these patterns are ignored by models trained only on the video corpus. Therefore, the resultant networks have better generalizability and better recognition rates. We show that by means of transfer learning from image to video, we can learn a frame-based recognizer with only 4k videos. Because the image corpus is weakly labeled, the entire learning process requires only 4k annotated instances, which is far less than the million scale image data sets required by previous works. The same approach may be applied to other visual recognition tasks where only scarce training data is available, and it improves the applicability of dcns to various computer vision problems. Our experiments also reveal the correlation between meta-parameters and the performance of dcns, given the properties of the target problem and data.
These results lead to a heuristic for meta-parameter selection in future research, which does not rely on time consuming meta-parameter search.",4 "applying supervised learning algorithms and a new feature selection method to predict coronary artery disease. From a fresh data science perspective, this thesis discusses the prediction of coronary artery disease based on genetic variations at the dna base pair level, called single-nucleotide polymorphisms (snps), collected from the ontario heart genomics study (ohgs). First, the thesis explains two commonly used supervised learning algorithms, the k-nearest neighbour (k-nn) and random forest classifiers, and includes a complete proof that the k-nn classifier is universally consistent in any finite dimensional normed vector space. Second, the thesis introduces two dimensionality reduction steps: random projections, a known feature extraction technique based on the johnson-lindenstrauss lemma, and a new method termed mass transportation distance (mtd) feature selection for discrete domains. Then, the thesis compares the performance of random projections with the k-nn classifier against mtd feature selection with random forest, for predicting artery disease, based on accuracy, the f-measure, and the area under the receiver operating characteristic (roc) curve. The comparative results demonstrate that mtd feature selection with random forest is vastly superior to random projections with k-nn. The random forest classifier is able to obtain an accuracy of 0.6660 and an area under the roc curve of 0.8562 on the ohgs genetic dataset, with 3335 snps selected by mtd feature selection for classification. This area is considerably better than the previous high score of 0.608 obtained by davies et al. in 2010 on the same dataset.",4 "replication study: development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. We have replicated the experiments of 'development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs' published in jama 2016; 316(22). We re-implemented the methods since the source code is not available. The original study used fundus images from eyepacs and from three hospitals in india for training their detection algorithm. We used a different eyepacs data set that was made available in a kaggle competition.
For evaluating the algorithm's performance, the benchmark data set messidor-2 was used in the original study; we used the similar messidor-original data set to evaluate our algorithm's performance. For the original study, licensed ophthalmologists re-graded all obtained images for diabetic retinopathy, macular edema, and image gradability. Our challenge was to re-implement the methods with publicly available data sets and only one diabetic retinopathy grade per image, to find the hyper-parameter settings for training and validation that were not described in the original study, and to make an assessment of the impact of training with ungradable images. We were not able to reproduce the performance reported in the original study. We believe our model did not learn to recognize lesions in fundus images, since we had only a singular grade for diabetic retinopathy per image, instead of multiple grades per image. Furthermore, the original study missed details regarding hyper-parameter settings for training and validation. The original study may also have used image quality grades as input for training the network. We believe that deep learning algorithms should be easily replicated, and that ideally source code should be published so that other researchers can confirm the results of experiments. Our source code and instructions for running the replication are available at: https://github.com/mikevoets/jama16-retina-replication.",4 "predictive business process monitoring with lstm neural networks. Predictive business process monitoring methods exploit logs of completed cases of a process in order to make predictions about running cases thereof. Existing methods in this space are tailor-made for specific prediction tasks. Moreover, their relative accuracy is highly sensitive to the dataset at hand, thus requiring users to engage in trial-and-error tuning when applying them in a specific setting. This paper investigates long short-term memory (lstm) neural networks as an approach to build consistently accurate models for a wide range of predictive process monitoring tasks. First, we show that lstms outperform existing techniques to predict the next event of a running case and its timestamp. Next, we show how to use models for predicting the next task in order to predict the full continuation of a running case. Finally, we apply the same approach to predict the remaining time, and show that this approach outperforms existing tailor-made methods.",19 "relative comparison kernel learning with auxiliary kernels.
In this work we consider the problem of learning a positive semidefinite kernel matrix from relative comparisons of the form: ""object a is more similar to object b than to object c"", where the comparisons are given by humans. Existing solutions to this problem assume that many comparisons are provided to learn a high quality kernel. However, this can be considered unrealistic for many real-world tasks since relative assessments require human input, which is often costly or difficult to obtain. Because of this, only a limited number of comparisons may be provided. In this work, we explore methods for aiding the process of learning a kernel with the help of auxiliary kernels built from more easily extractable information regarding the relationships among objects. We propose a new kernel learning approach in which the target kernel is defined as a conic combination of auxiliary kernels and a kernel whose elements are learned directly. We formulate a convex optimization to solve for this target kernel that adds only minor overhead to methods that use no auxiliary information. Empirical results show that in the presence of few training relative comparisons, our method can learn kernels that generalize to more out-of-sample comparisons than methods that do not utilize auxiliary information, as well as similar methods that learn metrics over objects.",4 "can we boost the power of the viola-jones face detector using pre-processing? an empirical study. The viola-jones face detection algorithm was (and still is) a quite popular face detector. In spite of the numerous face detection techniques recently presented, many research works are still based on the viola-jones algorithm because of its simplicity. In this paper, we study the influence of a set of blind pre-processing methods on the face detection rate using the viola-jones algorithm. We focus on two aspects of improvement, specifically badly illuminated faces and blurred faces. Many methods for lighting invariance and deblurring are used in order to improve the detection accuracy. We want to avoid using blind pre-processing methods that may obstruct the face detector. To that end, we perform two sets of experiments. The first set is performed to rule out any blind pre-processing method that may hurt the face detector. The second set is performed to study the effect of the selected pre-processing methods on images that suffer from hard conditions. We present two manners of applying a pre-processing method to an image prior to being used by the viola-jones face detector.
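Histogram equalization is one common blind photometric normalization that could serve as such a pre-processing step for badly illuminated faces; a minimal numpy sketch follows (the paper evaluates several normalization methods, so this is an illustrative stand-in, not necessarily one of theirs):

```python
import numpy as np

def equalize_histogram(gray):
    """Blind photometric normalization via histogram equalization
    on an 8-bit grayscale image (intensity values 0..255)."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[np.nonzero(cdf)][0]
    # remap intensities so the cumulative distribution becomes ~uniform
    lut = np.clip(np.round((cdf - cdf_min) / (gray.size - cdf_min) * 255),
                  0, 255).astype(np.uint8)
    return lut[gray]

# toy: a dark, low-contrast image gets stretched to the full range
dark = (np.arange(64, dtype=np.uint8).reshape(8, 8) // 4 + 10)
eq = equalize_histogram(dark)
```

After equalization the darkest occurring intensity maps to 0 and the brightest to 255, which is the kind of contrast stretch that can help a pre-trained detector on badly illuminated inputs.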
Four different datasets are used to draw a coherent conclusion about the potential improvement gained by using prior enhanced images. The results demonstrate that some pre-processing methods may hurt the accuracy of the viola-jones face detection algorithm, while other pre-processing methods have an evident positive impact on the accuracy of the face detector. Overall, we recommend three simple and fast blind photometric normalization methods as a pre-processing step in order to improve the accuracy of the pre-trained viola-jones face detector.",4 "variational dropout and the local reparameterization trick. We investigate a local reparameterization technique for greatly reducing the variance of stochastic gradients for variational bayesian inference (sgvb) of a posterior over model parameters, while retaining parallelizability. This local reparameterization translates uncertainty about global parameters into local noise that is independent across datapoints in the minibatch. Such parameterizations can be trivially parallelized and have variance that is inversely proportional to the minibatch size, generally leading to much faster convergence. Additionally, we explore a connection with dropout: gaussian dropout objectives correspond to sgvb with local reparameterization, a scale-invariant prior and proportionally fixed posterior variance. Our method allows inference of more flexibly parameterized posteriors; specifically, we propose variational dropout, a generalization of gaussian dropout where the dropout rates are learned, often leading to better models. The method is demonstrated through several experiments.",19 "adaptive substring extraction and modified local nbnn scoring for binary feature-based local mobile visual search without false positives. In this paper, we propose a stand-alone mobile visual search system based on binary features and the bag-of-visual-words framework. The contribution of this study is three-fold: (1) we propose an adaptive substring extraction method that adaptively extracts informative bits from the original binary vector and stores them in the inverted index. These substrings are used to refine visual word-based matching. (2) A modified local nbnn scoring method is proposed in the context of image retrieval, which considers the density of binary features in scoring each feature matching.
(3) In order to suppress false positives, we introduce a convexity check step that imposes a convexity constraint on the configuration of the transformed reference image. The proposed system improves retrieval accuracy by 11% compared with a conventional method, without increasing the database size. Furthermore, our system with the convexity check does not lead to any false positive results.",4 "auxiliary objectives for neural error detection models. We investigate the utility of different auxiliary objectives and training strategies within a neural sequence labeling approach to error detection in learner writing. Auxiliary costs provide the model with additional linguistic information, allowing it to learn general-purpose compositional features that can then be exploited for other objectives. Our experiments show that a joint learning approach trained with parallel labels on in-domain data improves performance over the previous best error detection system. While the resulting model has the same number of parameters, the additional objectives allow it to be optimised more efficiently and achieve better performance.",4 "metric learning with the pairwise kernel for graph inference. Much recent work in bioinformatics has focused on the inference of various types of biological networks, representing gene regulation, metabolic processes, protein-protein interactions, etc. A common setting involves inferring network edges in a supervised fashion from a set of high-confidence edges, possibly characterized by multiple, heterogeneous data sets (protein sequence, gene expression, etc.). Here, we distinguish between two modes of inference in this setting: direct inference based upon similarities between nodes joined by an edge, and indirect inference based upon similarities between one pair of nodes and another pair of nodes. We propose a supervised approach for the direct case by translating it into a distance metric learning problem. A relaxation of the resulting convex optimization problem leads to a support vector machine (svm) algorithm with a particular kernel for pairs, which we call the metric learning pairwise kernel (mlpk). We demonstrate, using several real biological networks, that this direct approach often improves upon the state-of-the-art svm for indirect inference with the tensor product pairwise kernel.",16 "a cluster approach to domains formation.
As a rule, a quadratic functional depending on a great number of binary variables has a lot of local minima. One of the approaches allowing one, on average, to find deeper local minima is the aggregation of binary variables into larger blocks/domains. To minimize the functional, one then changes the states of the aggregated variables (domains). In the present publication we discuss methods of domains formation. It is shown that the best results are obtained when domains are formed from variables that are strongly connected with each other.",4 "verb semantics and lexical selection. In this paper we focus on the semantic representation of verbs in computer systems and its impact on lexical selection problems in machine translation (mt). Two groups of english and chinese verbs are examined to show that lexical selection must be based on the interpretation of the whole sentence as well as on the selection restrictions placed on verb arguments. A novel representation scheme is suggested, and compared with the representations based on selection restrictions used in transfer-based mt. We see our approach as closely aligned with knowledge-based mt approaches (kbmt), and as a separate component that could be incorporated into existing systems. Examples and experimental results show that, using this scheme, inexact matches can achieve correct lexical selection.",2 "feature extraction using latent dirichlet allocation and neural networks: a case study on movie synopses. Feature extraction has gained increasing attention in the field of machine learning, since in order to detect patterns, extract information, and predict future observations from big data, the urge for informative features is crucial. The process of extracting features is highly linked to dimensionality reduction, as it implies the transformation of the data from a sparse high-dimensional space to higher level meaningful abstractions. This dissertation employs neural networks for distributed paragraph representations, and latent dirichlet allocation, to capture higher level features of paragraph vectors. Although neural networks for distributed paragraph representations are considered the state of the art for extracting paragraph vectors, we show that a quick topic analysis model such as latent dirichlet allocation can provide meaningful features too. We evaluate the two methods on the cmu movie summary corpus, a collection of 25,203 movie plot summaries extracted from wikipedia.
Finally, for both approaches, we use k-nearest neighbors to discover similar movies, and plot the projected representations using t-distributed stochastic neighbor embedding to depict the context similarities. These similarities, expressed as movie distances, can be used for movie recommendation. The movies recommended by this approach are compared with the movies recommended by imdb, which uses a collaborative filtering recommendation approach, to show that the two models could constitute either an alternative or a supplementary recommendation approach.",4 "effective object tracking in unstructured crowd scenes. In this paper, we present a rotation variant oriented texture curve (otc) descriptor based mean shift algorithm for tracking an object in an unstructured crowd scene. The proposed algorithm works by first obtaining the otc features of a manually selected object target; then a visual vocabulary is created using the otc features of the target. The target histogram is obtained using a codebook encoding method, which is then used in the mean shift framework to perform a similarity search. Results are obtained on different videos with challenging scenes, and a comparison of the proposed approach with several state-of-the-art approaches is provided. The analysis shows the advantages and limitations of the proposed approach for tracking an object in unstructured crowd scenes.",4 "towards a fully automated drive in urban environments: a demonstration in gomentum station, california. Each year, millions of motor vehicle traffic accidents all over the world cause a large number of fatalities, injuries and significant material loss. Automated driving (ad) has the potential to drastically reduce such accidents. In this work, we focus on the technical challenges that arise from ad in urban environments. We present the overall architecture of an ad system and describe in detail the perception and planning modules. The ad system, built on a modified acura rlx, was demonstrated on a course at gomentum station in california. We demonstrated autonomous handling of 4 scenarios: traffic lights, cross-traffic at intersections, construction zones and pedestrians. The ad vehicle displayed safe behavior and performed consistently in repeated demonstrations with slight variations in conditions.
Overall, we completed 44 runs, encompassing 110km of automated driving, with only 3 cases where the driver intervened in the control of the vehicle, mostly due to error in gps positioning. Our demonstration showed that robust and consistent behavior in urban scenarios is possible, yet more investigation is necessary for a full scale roll-out on public roads.",4 "unconstrained fashion landmark detection via hierarchical recurrent transformer networks. Fashion landmarks are functional key points defined on clothes, such as the corners of the neckline, hemline, and cuff. They have been recently introduced as an effective visual representation for fashion image understanding. However, detecting fashion landmarks is challenging due to background clutters, human poses, and scales. To remove the above variations, previous works usually assumed that the bounding boxes of clothes are provided in training and test as additional annotations, which are expensive to obtain and inapplicable in practice. This work addresses unconstrained fashion landmark detection, where clothing bounding boxes are not provided in either training or test. To this end, we present a novel deep landmark network (dlan), where bounding boxes and landmarks are jointly estimated and trained iteratively in an end-to-end manner. dlan contains two dedicated modules, including a selective dilated convolution for handling scale discrepancies, and a hierarchical recurrent spatial transformer for handling background clutters. To evaluate dlan, we present a large-scale fashion landmark dataset, namely the unconstrained landmark database (uld), consisting of 30k images. Statistics show that uld is more challenging than existing datasets in terms of image scales, background clutters, and human poses. Extensive experiments demonstrate the effectiveness of dlan over the state-of-the-art methods. dlan also exhibits excellent generalization across different clothing categories and modalities, making it extremely suitable for real-world fashion analysis.",4 "a mixture cox-logistic model for feature selection from survival and classification data. This paper presents an original approach for jointly fitting survival times and classifying samples into subgroups. The coxlogit model is a generalized linear model with a common set of selected features for both tasks.
Survival times and class labels are assumed to be conditioned by a common risk score which depends on these features. Learning is naturally expressed as maximizing the joint probability of subgroup labels and of the ordering of survival events, conditioned on a common weight vector. The model is estimated by minimizing a regularized log-likelihood through a coordinate descent algorithm. Validation on synthetic and breast cancer data shows that the proposed approach outperforms a standard cox model and logistic regression at both predicting survival times and classifying new samples into subgroups. It is also better at selecting informative features for both tasks.",19 "efficient and practical stochastic subgradient descent for nuclear norm regularization. We describe novel subgradient methods for a broad class of matrix optimization problems involving nuclear norm regularization. Unlike existing approaches, our method executes very cheap iterations by combining low-rank stochastic subgradients with efficient incremental svd updates, made possible by highly optimized and parallelizable dense linear algebra operations on small matrices. Our practical algorithms always maintain a low-rank factorization of iterates that can be conveniently held in memory and efficiently multiplied to generate predictions in matrix completion settings. Empirical comparisons confirm that our approach is highly competitive with several recently proposed state-of-the-art solvers for such problems.",4 "an experimental comparison of single-pixel imaging algorithms. Single-pixel imaging (spi) is a novel technique for capturing 2d images using a photodiode, instead of conventional 2d array sensors. spi owns a high signal-to-noise ratio, a wide spectrum range, low cost, and robustness to light scattering. Various algorithms have been proposed for spi reconstruction, including the linear correlation methods, the alternating projection method (ap), and the compressive sensing based methods. However, there has been no comprehensive review discussing their respective advantages, which is important for spi's applications and development. In this paper, we review and compare these algorithms in a unified reconstruction framework. Besides, we propose two other spi algorithms, including a conjugate gradient descent based method (cgd) and a poisson maximum likelihood based method.
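A conjugate-gradient reconstruction for the linear SPI measurement model y = A·x can be sketched as follows (a generic CG solver on the normal equations, assuming random measurement patterns; the paper's CGD method and pattern design may differ):

```python
import numpy as np

def spi_reconstruct_cgd(A, y, iters=50):
    """Recover a flattened image x from single-pixel measurements
    y = A @ x by conjugate gradients on A^T A x = A^T y."""
    x = np.zeros(A.shape[1])
    r = A.T @ y          # residual of the normal equations at x = 0
    p = r.copy()
    for _ in range(iters):
        rs = r @ r
        if rs < 1e-12:   # converged
            break
        Ap = A.T @ (A @ p)
        alpha = rs / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        p = r + ((r @ r) / rs) * p
    return x

rng = np.random.default_rng(0)
x_true = rng.random(16)              # toy 4x4 scene, flattened
A = rng.standard_normal((64, 16))    # 64 random illumination patterns
y = A @ x_true                       # noiseless photodiode measurements
x_hat = spi_reconstruct_cgd(A, y)
```

With an overdetermined, well-conditioned system and no noise, CG recovers the scene essentially exactly; in practice noise and undersampling are what separate the algorithm families the abstract compares.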
Both simulations and experiments validate the following conclusions: to obtain comparable reconstruction accuracy, the compressive sensing based total variation regularization method (tv) requires the fewest measurements and consumes the least running time for small-scale reconstruction; the cgd and ap methods run fastest in large-scale cases; and the tv and ap methods are the most robust to measurement noise. In a word, there are trade-offs between capture efficiency, computational complexity and robustness to noise among the different spi algorithms. We have released our source code for non-commercial use.",4 "parallel implementation of efficient search schemes for the inference of cancer progression models. The emergence and development of cancer is a consequence of the accumulation over time of genomic mutations involving a specific set of genes, which provides the cancer clones with a functional selective advantage. In this work, we model the order of accumulation of such mutations during the progression, which eventually leads to the disease, by means of probabilistic graphic models, i.e., bayesian networks (bns). We investigate how to perform the task of learning the structure of such bns, according to experimental evidence, by adopting global optimization meta-heuristics. In particular, in this work we rely on genetic algorithms, and, to strongly reduce the execution time of the inference -- which can also involve multiple repetitions to collect statistically significant assessments of the data -- we distribute the calculations using both multi-threading and a multi-node architecture. The results show that our approach is characterized by good accuracy and specificity; we also demonstrate its feasibility, thanks to a 84x reduction of the overall execution time with respect to a traditional sequential implementation.",4 "machine learning for indoor localization using mobile phone-based sensors. In this paper we investigate the problem of localizing a mobile device based on readings from its embedded sensors utilizing machine learning methodologies. We consider a real-world environment, collect a large dataset of 3110 datapoints, and examine the performance of a substantial number of machine learning algorithms in localizing a mobile device. We found algorithms that give a mean error as accurate as 0.76 meters, outperforming other indoor localization systems reported in the literature.
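Instance-based localization of the kind this abstract describes is often implemented as k-NN fingerprinting: the device's position is estimated from the positions of the k training fingerprints closest to its current sensor reading. A minimal sketch, with an invented toy fingerprint database (not the paper's data or exact method):

```python
import numpy as np

def knn_localize(fingerprints, positions, reading, k=3):
    """Estimate position as the mean location of the k training
    fingerprints nearest (Euclidean) to the new sensor reading."""
    d = np.linalg.norm(fingerprints - reading, axis=1)
    nearest = np.argsort(d)[:k]
    return positions[nearest].mean(axis=0)

# toy database: 2-D signal vectors tagged with 2-D coordinates
fingerprints = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
positions = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 10.0], [11.0, 10.0]])
est = knn_localize(fingerprints, positions, np.array([0.95, 0.05]), k=2)
```

Here the reading matches the first two fingerprints, so the estimate is the midpoint of their locations; the hybrid approach in the abstract speeds up exactly this kind of nearest-neighbour query.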
also propose hybrid instance-based approach results speed increase factor ten loss accuracy live deployment standard instance-based methods, allowing fast accurate localization. further, determine smaller datasets collected less density affect accuracy localization, important use real-world environments. finally, demonstrate approaches appropriate real-world deployment evaluating performance online, in-motion experiment.",4 "convolutional analysis operator learning: acceleration, convergence, application, neural networks. convolutional operator learning increasingly gaining attention many signal processing computer vision applications. learning kernels mostly relied so-called local approaches extract store many overlapping patches across training signals. due memory demands, local approaches limitations learning kernels large datasets -- particularly multi-layered structures, e.g., convolutional neural network (cnn) -- and/or applying learned kernels high-dimensional signal recovery problems. so-called global approach studied within ""synthesis"" signal model, e.g., convolutional dictionary learning, overcoming memory problems careful algorithmic designs. paper proposes new convolutional analysis operator learning (caol) framework global approach, develops new convergent block proximal gradient method using majorizer (bpg-m) solve corresponding block multi-nonconvex problems. learn diverse filters within caol framework, paper introduces orthogonality constraint enforces tight-frame (tf) filter condition, regularizer promotes diversity filters. numerical experiments show that, tight majorizers, bpg-m significantly accelerates caol convergence rate compared state-of-the-art method, bpg. numerical experiments sparse-view computational tomography show caol using tf filters significantly improves reconstruction quality compared conventional edge-preserving regularizer. 
finally, paper shows caol useful mathematically model cnn, corresponding updates obtained via bpg-m coincide core modules cnn.",19 "support vector machines/relevance vector machine remote sensing classification: review. kernel-based machine learning algorithms based mapping data original input feature space kernel feature space higher dimensionality solve linear problem space. last decade, kernel based classification regression approaches support vector machines widely used remote sensing well various civil engineering applications. spite better performance different datasets, support vector machines still suffer shortcomings visualization/interpretation model, choice kernel kernel specific parameter well regularization parameter. relevance vector machines another kernel based approach explored classification regression last years. advantages relevance vector machines support vector machines availability probabilistic predictions, using arbitrary kernel functions requiring setting regularization parameter. paper presents state-of-the-art review svm rvm remote sensing provides details use civil engineering application also.",4 "efficient rectangular maximal-volume algorithm rating elicitation collaborative filtering. cold start problem collaborative filtering solved asking new users rate small seed set representative items asking representative users rate new item. question build seed set give enough preference information making good recommendations. one successful approaches, called representative based matrix factorization, based maxvol algorithm. unfortunately, approach one important limitation --- seed set particular size requires rating matrix factorization fixed rank coincide size. necessarily optimal general case. current paper, introduce fast algorithm analytical generalization approach call rectangular maxvol. allows rank factorization lower required size seed set. 
moreover, paper includes theoretical analysis method's error, complexity analysis existing methods comparison state-of-the-art approaches.",4 "electre tri-machine learning approach record linkage problem. short paper, electre tri-machine learning method, generally used solve ordinal classification problems, proposed solving record linkage problem. preliminary experimental results show that, using electre tri method, high accuracy achieved 99% matches nonmatches correctly identified procedure.",19 "discovering emerging topics social streams via link anomaly detection. detection emerging topics receiving renewed interest motivated rapid growth social networks. conventional term-frequency-based approaches may appropriate context, information exchanged texts also images, urls, videos. focus social aspects these networks. is, links users generated dynamically intentionally unintentionally replies, mentions, retweets. propose probability model mentioning behaviour social network user, propose detect emergence new topic anomaly measured model. combine proposed mention anomaly score recently proposed change-point detection technique based sequentially discounting normalized maximum likelihood (sdnml), kleinberg's burst model. aggregating anomaly scores hundreds users, show detect emerging topics based reply/mention relationships social network posts. demonstrate technique number real data sets gathered twitter. experiments show proposed mention-anomaly-based approaches detect new topics least early conventional term-frequency-based approach, sometimes much earlier keyword ill-defined.",19 "twitter sentiment analysis: lexicon method, machine learning method combination. paper covers two approaches sentiment analysis: i) lexicon based method; ii) machine learning method. describe several techniques implement approaches discuss adopted sentiment classification twitter messages. 
present comparative study different lexicon combinations show enhancing sentiment lexicons emoticons, abbreviations social-media slang expressions increases accuracy lexicon-based classification twitter. discuss importance feature generation feature selection processes machine learning sentiment classification. quantify performance main sentiment analysis methods twitter run algorithms benchmark twitter dataset semeval-2013 competition, task 2-b. results show machine learning method based svm naive bayes classifiers outperforms lexicon method. present new ensemble method uses lexicon based sentiment score input feature machine learning approach. combined method proved produce precise classifications. also show employing cost-sensitive classifier highly unbalanced datasets yields improvement sentiment classification performance 7%.",4 "bayesian conditional gaussian network classifiers applications mass spectra classification. classifiers based probabilistic graphical models effective. continuous domains, maximum likelihood usually used assess predictions classifiers. data scarce, easily lead overfitting. probabilistic setting, bayesian averaging (ba) provides theoretically optimal predictions known robust overfitting. work introduce bayesian conditional gaussian network classifiers, efficiently perform exact bayesian averaging parameters. evaluate proposed classifiers maximum likelihood alternatives proposed far standard uci datasets, concluding performing ba improves quality assessed probabilities (conditional log likelihood) whilst maintaining error rate. overfitting likely occur domains number data items small number variables large. two conditions met realm bioinformatics, early diagnosis cancer mass spectra relevant task. provide application classification framework problem, comparing standard maximum likelihood alternative, improvement quality assessed probabilities confirmed.",4 using machine learning medium frequency derivative portfolio trading. 
use machine learning designing medium frequency trading strategy portfolio 5 year 10 year us treasury note futures. formulate classification problem predict weekly direction movement portfolio using features extracted deep belief network trained technical indicators portfolio constituents. experimentation shows resulting pipeline effective making profitable trade.,17 "shapley value solution game theoretic-based feature reduction false alarm detection. false alarm one main concerns intensive care units result care disruption, sleep deprivation, insensitivity care-givers alarms. several methods proposed suppress false alarm rate improving quality physiological signals filtering, developing accurate sensors. however, significant intrinsic correlation among extracted features limits performance currently available data mining techniques, often discard predictors low individual impact may potentially strong discriminatory power grouped others. propose model based coalition game theory considers inter-features dependencies determining salient predictors respect false alarm, results improved classification accuracy. superior performance method compared current methods shown simulation results using physionet's mimic ii database.",4 "using description logics rdf constraint checking closed-world recognition. rdf description logics work open-world setting absence information information absence. nevertheless, description logic axioms interpreted closed-world setting setting used constraint checking closed-world recognition information sources. information sources expressed well-behaved rdf rdfs (i.e., rdf graphs interpreted rdf rdfs semantics) constraint checking closed-world recognition simple describe. constraint checking implemented sparql querying thus effectively performed.",4 "improving recall situ sequencing self-learned features graphical model. 
image-based sequencing mrna makes possible see tissue sample given gene active, thus discern large numbers different cell types parallel. crucial gaining better understanding tissue development disease cancer. signals collected multiple staining imaging cycles, signal density together noise makes signal decoding challenging. previous approaches led low signal recall efforts maintain high sensitivity. propose approach signal candidates generously included, true-signal probability cycle level self-learned using convolutional neural network. signal candidates probability predictions thereafter fed graphical model searching signal candidates across sequencing cycles. graphical model combines intensity, probability spatial distance find optimal paths representing decoded signal sequences. evaluate approach relation state-of-the-art, show increase recall $27\%$ maintained sensitivity. furthermore, visual examination shows correctly resolved signals previously lost due high signal density. thus, proposed approach potential significantly improve analysis spatial statistics situ sequencing experiments.",16 "rule-based semantic tagging. application undergoing dictionary glosses. project presented article aims formalize criteria procedures order extract semantic information parsed dictionary glosses. actual purpose project generation semantic network (nearly ontology) issued monolingual italian dictionary, unsupervised procedures. since project involves rule-based parsing, semantic tagging word sense disambiguation techniques, outcomes may find interest also beyond immediate intent. cooperation syntactic semantic features meaning construction investigated, procedures allows translation syntactic dependencies semantic relations discussed. procedures rise project applied also text types dictionary glosses, convert output parsing process semantic representation. addition mechanism sketched may lead kind procedural semantics, multiple paraphrases given expression generated. 
means techniques may find application also 'query expansion' strategies, interesting information retrieval, search engines question answering systems.",4 "masked conditional neural networks automatic sound events recognition. deep neural network architectures designed application domains sound, especially image recognition, may optimally harness time-frequency representation adapted sound recognition problem. work, explore conditional neural network (clnn) masked conditional neural network (mclnn) multi-dimensional temporal signal recognition. clnn considers inter-frame relationship, mclnn enforces systematic sparseness network's links enable learning frequency bands rather bins allowing network frequency shift invariant mimicking filterbank. mask also allows considering several combinations features concurrently, usually handcrafted exhaustive manual search. applied mclnn environmental sound recognition problem using esc-10 esc-50 datasets. mclnn achieved competitive performance, using 12% parameters without augmentation, compared state-of-the-art convolutional neural networks.",4 "efficient gradient estimation motor control learning. task estimating gradient function presence noise central several forms reinforcement learning, including policy search methods. present two techniques reducing gradient estimation errors presence observable input noise applied control signal. first method extends idea reinforcement baseline fitting local linear model function whose gradient estimated; show find linear model minimizes variance gradient estimate, estimate model data. second method improves discounting components gradient vector high variance. methods applied problem motor control learning, actuator noise significant influence behavior. 
particular, apply techniques learn locally optimal controllers dart-throwing task using simulated three-link arm; demonstrate proposed methods significantly improve reward function gradient estimate and, consequently, learning curve, existing methods.",4 "learned versus hand-designed feature representations 3d agglomeration. image recognition labeling tasks, recent results suggest machine learning methods rely manually specified feature representations may outperformed methods automatically derive feature representations based data. yet problems involve analysis 3d objects, mesh segmentation, shape retrieval, neuron fragment agglomeration, remains strong reliance hand-designed feature descriptors. paper, evaluate large set hand-designed 3d feature descriptors alongside features learned raw data using end-to-end unsupervised learning techniques, context agglomeration 3d neuron fragments. combining unsupervised learning techniques novel dynamic pooling scheme, show pure learning-based methods first time competitive hand-designed 3d shape descriptors. investigate data augmentation strategies dramatically increasing size training set, show combining learned hand-designed features leads highest accuracy.",4 "toolnet: holistically-nested real-time segmentation robotic surgical tools. real-time tool segmentation endoscopic videos essential part many computer-assisted robotic surgical systems critical importance robotic surgical data science. propose two novel deep learning architectures automatic segmentation non-rigid surgical instruments. methods take advantage automated deep-learning-based multi-scale feature extraction trying maintain accurate segmentation quality resolutions. two proposed methods encode multi-scale constraint inside network architecture. first proposed architecture enforces cascaded aggregation predictions second proposed network means holistically-nested architecture loss scale taken account optimization process. 
proposed methods real-time semantic labeling, present reduced number parameters. propose use parametric rectified linear units semantic labeling small architectures increase regularization ability design maintain segmentation accuracy without overfitting training sets. compare proposed architectures state-of-the-art fully convolutional networks. validate methods using existing benchmark datasets, including ex vivo cases phantom tissue different robotic surgical instruments present scene. results show statistically significant improved dice similarity coefficient previous instrument segmentation methods. analyze design choices discuss key drivers improving accuracy.",4 adopting robustness optimality fitting learning. generalized modified exponentialized estimator pushing robust-optimal (ro) index $\lambda$ $-\infty$ achieving robustness outliers optimizing quasi-minimin function. robustness realized controlled adaptively ro index without predefined threshold. optimality guaranteed expansion convexity region hessian matrix largely avoid local optima. detailed quantitative analysis robustness optimality provided. results proposed experiments fitting tasks three noisy non-convex functions digits recognition task mnist dataset consolidate conclusions.,4 "norm matters: efficient accurate normalization schemes deep networks. past years batch-normalization commonly used deep networks, allowing faster training high performance wide variety applications. however, reasons behind merits remained unanswered, several shortcomings hindered use certain tasks. work present novel view purpose function normalization methods weight-decay, tools decouple weights' norm underlying optimized objective. also improve use weight-normalization show connection practices normalization, weight decay learning-rate adjustments. 
finally, suggest several alternatives widely used $l^2$ batch-norm, using normalization $l^1$ $l^\infty$ spaces substantially improve numerical stability low-precision implementations well provide computational memory benefits. demonstrate methods enable first batch-norm alternative work half-precision implementations.",19 "matching-cnn meets knn: quasi-parametric human parsing. parametric non-parametric approaches demonstrated encouraging performances human parsing task, namely segmenting human image several semantic regions (e.g., hat, bag, left arm, face). work, aim develop new solution advantages methodologies, namely supervision annotated data flexibility use newly annotated (possibly uncommon) images, present quasi-parametric human parsing model. classic k nearest neighbor (knn)-based nonparametric framework, parametric matching convolutional neural network (m-cnn) proposed predict matching confidence displacements best matched region testing image particular semantic region one knn image. given testing image, first retrieve knn images annotated/manually-parsed human image corpus. semantic region knn image matched confidence testing image using m-cnn, matched regions knn images fused, followed superpixel smoothing procedure obtain ultimate human parsing result. m-cnn differs classic cnn tailored cross image matching filters introduced characterize matching testing image semantic region knn image. cross image matching filters defined different convolutional layers, aiming capture particular range displacements. comprehensive evaluations large dataset 7,700 annotated human images well demonstrate significant performance gain quasi-parametric model state-of-the-arts, human parsing task.",4 "nonparametric bayesian approach toward stacked convolutional independent component analysis. 
unsupervised feature learning algorithms based convolutional formulations independent components analysis (ica) demonstrated yield state-of-the-art results several action recognition benchmarks. however, existing approaches allow number latent components (features) automatically inferred data unsupervised manner. significant disadvantage state-of-the-art, results considerable burden imposed researchers practitioners, must resort tedious cross-validation procedures obtain optimal number latent features. resolve issues, paper introduce convolutional nonparametric bayesian sparse ica architecture overcomplete feature learning high-dimensional data. method utilizes indian buffet process prior facilitate inference appropriate number latent features hybrid variational inference algorithm, scalable massive datasets. show, model naturally used obtain deep unsupervised hierarchical feature extractors, greedily stacking successive model layers, similar existing approaches. addition, inference model completely heuristics-free; thus, obviates need tedious parameter tuning, major challenge deep learning approaches faced with. evaluate method several action recognition benchmarks, exhibit advantages state-of-the-art.",4 invertibility robustness phaseless reconstruction. paper concerned question reconstructing vector finite-dimensional real hilbert space magnitudes coefficients vector redundant linear map known. analyze various lipschitz bounds nonlinear analysis map establish theoretical performance bounds reconstruction algorithm. show robust stable reconstruction requires additional redundancy critical threshold.,12 "adaptive objectness object tracking. object tracking long standing problem vision. great efforts spent improve tracking performance, simple yet reliable prior knowledge left unexploited: target object tracking must object non-object. recently proposed popularized objectness measure provides natural way model prior visual tracking. 
thus motivated, paper propose adapt objectness visual object tracking. instead directly applying existing objectness measure generic handles various objects environments, adapt compatible specific tracking sequence object. specifically, use newly proposed bing objectness base, train object-adaptive objectness tracking task. training implemented using adaptive support vector machine integrates information specific tracking target bing measure. emphasize benefit proposed adaptive objectness, named adobing, generic. show this, combine adobing seven top performed trackers recent evaluations. run adobing-enhanced trackers base trackers two popular benchmarks, cvpr2013 benchmark (50 sequences) princeton tracking benchmark (100 sequences). benchmarks, methods consistently improve base trackers, also achieve best known performances. noting way integrate objectness visual tracking generic straightforward, expect even improvement using tracker-specific objectness.",4 "robust 3d-2d interactive tool scene segmentation annotation. recent advances 3d acquisition devices enabled large-scale acquisition 3d scene data. data, completely well annotated, serve useful ingredients wide spectrum computer vision graphics works data-driven modeling scene understanding, object detection recognition. however, annotating vast amount 3d scene data remains challenging due lack effective tool and/or complexity 3d scenes (e.g. clutter, varying illumination conditions). paper aims build robust annotation tool effectively conveniently enables segmentation annotation massive 3d data. tool works coupling 2d 3d information via interactive framework, users provide high-level semantic annotation objects. experimented tool found typical indoor scene could well segmented annotated less 30 minutes using tool, opposed hours done manually. along tool, created dataset hundred 3d scenes associated complete annotations using tool. 
tool dataset available www.scenenn.net.",4 "quality estimation machine translation outputs stemming. machine translation challenging problem indian languages. every day see machine translators developed, getting high quality automatic translation still distant dream. correct translated sentence hindi language rarely found. paper, emphasizing english-hindi language pair, order preserve correct mt output present ranking system, employs machine learning techniques morphological features. ranking human intervention required. also validated results comparing human ranking.",4 "bda-pch: block-diagonal approximation positive-curvature hessian training neural networks. propose block-diagonal approximation positive-curvature hessian (bda-pch) matrix measure curvature. proposed bda-pch matrix memory efficient applied fully-connected neural networks activation criterion functions twice differentiable. particularly, bda-pch matrix handle non-convex criterion functions. devise efficient scheme utilizing conjugate gradient method derive newton directions mini-batch setting. empirical studies show method outperforms competing second-order methods convergence speed.",4 "concept ""altruism"" sociological research: conceptualization operationalization. article addresses question relevant conceptualization «altruism» russian perspective sociological research operationalization. investigates spheres social application word «altruism», include russian equivalent «vzaimopomoshh'» (mutual help). data study comes russian national corpus (russian). theoretical framework consists paul f. lazarsfeld's theory sociological research methodology natural semantic metalanguage (nsm). quantitative analysis shows features representation altruism russian sociologists need know preparation questionnaires, interview guides analysis transcripts.",4 "stochastic gradient method accelerated stochastic dynamics. 
paper, propose novel technique implement stochastic gradient methods, beneficial learning large datasets, accelerated stochastic dynamics. stochastic gradient method based mini-batch learning reducing computational cost amount data large. stochasticity gradient mitigated injection gaussian noise, yields stochastic gradient langevin method; method used bayesian posterior sampling. however, performance stochastic gradient langevin method depends mixing rate stochastic dynamics. study, propose violating detailed balance condition enhance mixing rate. recent studies revealed violating detailed balance condition accelerates convergence stationary state reduces correlation time samplings. implement violation detailed balance condition stochastic gradient langevin method test method simple model demonstrate performance.",19 "bipolar possibilistic representations. recently, emphasized possibility theory framework allows us distinguish i) possible ruled available knowledge, ii) possible sure. distinction may useful representing knowledge, modelling values impossible consistent available knowledge one hand, values guaranteed possible reported observations hand. also interest expressing preferences, point values positively desired among rejected. distinction encoded two types constraints expressed terms necessity measures terms guaranteed possibility functions, induce pair possibility distributions semantic level. consistency condition ensure claimed guaranteed possible indeed impossible. present paper investigates representation bipolar view, including case stated means conditional measures, means comparative context-dependent constraints. interest bipolar framework, recently stressed expressing preferences, also pointed representation diagnostic knowledge.",4 "statistical constraints. introduce statistical constraints, declarative modelling tool links statistics constraint programming. discuss two statistical constraints associated filtering algorithms. 
finally, illustrate applications standard problems encountered statistics novel inspection scheduling problem aim find inspection plans desirable statistical properties.",4 "pose activity: surveying datasets introducing converse. present review current state publicly available datasets within human action recognition community; highlighting revival pose based methods recent progress understanding person-person interaction modeling. categorize datasets regarding several key properties usage benchmark dataset; including number class labels, ground truths provided, application domain occupy. also consider level abstraction dataset; grouping present actions, interactions higher level semantic activities. survey identifies key appearance pose based datasets, noting tendency simplistic, emphasized, scripted action classes often readily definable stable collection sub-action gestures. clear lack datasets provide closely related actions, implicitly identified via series poses gestures, rather dynamic set interactions. therefore propose novel dataset represents complex conversational interactions two individuals via 3d pose. 8 pairwise interactions describing 7 separate conversation based scenarios collected using two kinect depth sensors. intention provide events constructed numerous primitive actions, interactions motions, period time; providing set subtle action classes representative real world, challenge currently developed recognition methodologies. believe among one first datasets devoted conversational interaction classification using 3d pose features attributed papers show task indeed possible. full dataset made publicly available research community www.csvision.swansea.ac.uk/converse.",4 "non-uniform feature sampling decision tree ensembles. study effectiveness non-uniform randomized feature selection decision tree classification. 
experimentally evaluate two feature selection methodologies, based information extracted provided dataset: $(i)$ \emph{leverage scores-based} $(ii)$ \emph{norm-based} feature selection. experimental evaluation proposed feature selection techniques indicate approaches might effective compared naive uniform feature selection moreover comparable performance random forest algorithm [3]",19 "divide, denoise, defend adversarial attacks. deep neural networks, although shown successful class machine learning algorithms, known extremely unstable adversarial perturbations. improving robustness neural networks attacks important, especially security-critical applications. defend attacks, propose dividing input image multiple patches, denoising patch independently, reconstructing image, without losing significant image content. proposed defense mechanism non-differentiable makes non-trivial adversary apply gradient-based attacks. moreover, fine-tune network adversarial examples, making robust unknown attacks. present thorough analysis tradeoff accuracy robustness adversarial attacks. evaluate method black-box, grey-box, white-box settings. proposed method outperforms state-of-the-art significant margin imagenet dataset grey-box attacks maintaining good accuracy clean images. also establish strong baseline novel white-box attack.",4 "stochastic frank-wolfe methods nonconvex optimization. study frank-wolfe methods nonconvex stochastic finite-sum optimization problems. frank-wolfe methods (in convex case) gained tremendous recent interest machine learning optimization communities due projection-free property ability exploit structured constraints. however, understanding algorithms nonconvex setting fairly limited. paper, propose nonconvex stochastic frank-wolfe methods analyze convergence properties. 
objective functions decompose finite-sum, leverage ideas variance reduction techniques convex optimization obtain new variance reduced nonconvex frank-wolfe methods provably faster convergence classical frank-wolfe method. finally, show faster convergence rates variance reduced methods also translate improved convergence rates stochastic setting.",12 "entity embeddings categorical variables. map categorical variables function approximation problem euclidean spaces, entity embeddings categorical variables. mapping learned neural network standard supervised training process. entity embedding reduces memory usage speeds neural networks compared one-hot encoding, importantly mapping similar values close embedding space reveals intrinsic properties categorical variables. applied successfully recent kaggle competition able reach third position relative simple features. demonstrate paper entity embedding helps neural network generalize better data sparse statistics unknown. thus especially useful datasets lots high cardinality features, methods tend overfit. also demonstrate embeddings obtained trained neural network boost performance tested machine learning methods considerably used input features instead. entity embedding defines distance measure categorical variables used visualizing categorical data data clustering.",4 "learning augmented features heterogeneous domain adaptation. propose new learning method heterogeneous domain adaptation (hda), data source domain target domain represented heterogeneous features different dimensions. using two different projection matrices, first transform data two domains common subspace order measure similarity data two domains. propose two new feature mapping functions augment transformed data original features zeros. existing learning methods (e.g., svm svr) readily incorporated newly proposed augmented feature representations effectively utilize data domains hda. 
using the hinge loss function in svm as an example, we introduce the detailed objective function of our method called heterogeneous feature augmentation (hfa) for the linear case and also describe its kernelization in order to efficiently cope with data with high dimensions. moreover, we also develop an alternating optimization algorithm to effectively solve the nontrivial optimization problem in our hfa method. comprehensive experiments on two benchmark datasets clearly demonstrate that hfa outperforms the existing hda methods.",4 "contained neural style transfer for decorated logo generation. making decorated logos requires image editing skills; without sufficient skills, it could be a time-consuming task. although there are many on-line web services to make new logos, they have limited designs and duplicates can be made. we propose using neural style transfer with clip art and text for the creation of new and genuine logos. we introduce a new loss function based on the distance transform of the input image, which allows the preservation of the silhouettes of text and objects. the proposed method contains the style transfer to a designated area. we demonstrate the characteristics of the proposed method. finally, we show the results of logo generation with various input images.",4 "queue-aware distributive resource control for delay-sensitive two-hop mimo cooperative systems. in this paper, we consider a queue-aware distributive resource control algorithm for two-hop mimo cooperative systems. we shall illustrate that relay buffering is an effective way to reduce the intrinsic half-duplex penalty in cooperative systems. the complex interactions of the queues at the source node and the relays are modeled as an average-cost infinite horizon markov decision process (mdp). the traditional approach of solving this mdp problem involves centralized control with huge complexity. to obtain a distributive and low complexity solution, we introduce a linear structure which approximates the value function of the associated bellman equation by the sum of per-node value functions. we derive a distributive two-stage two-winner auction-based control policy which is a function of the local csi and local qsi only. furthermore, to estimate the best fit approximation parameter, we propose a distributive online stochastic learning algorithm using stochastic approximation theory.
finally, we establish technical conditions for almost-sure convergence and show that under heavy traffic, the proposed low complexity distributive control is global optimal.",4 "iterative learning of answer set programs from context dependent examples. in recent years, several frameworks and systems have been proposed that extend inductive logic programming (ilp) to the answer set programming (asp) paradigm. in ilp, examples must all be explained by a hypothesis together with a given background knowledge. in existing systems, the background knowledge is the same for all examples; however, examples may be context-dependent. this means that some examples should be explained in the context of some information, whereas others should be explained in different contexts. in this paper, we capture this notion and present a context-dependent extension of the learning from ordered answer sets framework. in this extension, contexts can be used to further structure the background knowledge. we then propose a new iterative algorithm, ilasp2i, which exploits this feature to scale up the existing ilasp2 system to learning tasks with large numbers of examples. we demonstrate the gain in scalability by applying both algorithms to various learning tasks. our results show that, compared to ilasp2, the newly proposed ilasp2i system can be two orders of magnitude faster and use two orders of magnitude less memory, whilst preserving the same average accuracy. this paper is under consideration for acceptance in tplp.",4 "understanding minimum probability flow for rbms under various kinds of dynamics. energy-based models are popular in machine learning due to the elegance of their formulation and their relationship to statistical physics. among these, the restricted boltzmann machine (rbm), whose staple training algorithm is contrastive divergence (cd), has been the prototype for recent advancements in the unsupervised training of deep neural networks. however, cd has limited theoretical motivation, and in some cases can produce undesirable behavior. here, we investigate the performance of minimum probability flow (mpf) learning for training rbms. unlike cd, with its focus on approximating the intractable partition function via gibbs sampling, mpf proposes a tractable, consistent, objective function defined in terms of a taylor expansion of the kl divergence with respect to sampling dynamics. we propose a general form for the sampling dynamics in mpf, and explore the consequences of different choices for these dynamics for training rbms.
experimental results show mpf outperforming cd for various rbm configurations.",4 "writer identification and verification from intra-variable individual handwriting. the handwriting of an individual may vary excessively with many factors such as mood, time, space, writing speed, writing medium, utensils, etc. therefore, it becomes challenging to perform automated writer verification/identification on a particular set of handwritten patterns (e.g., speedy handwriting) of a person, especially when the system is trained using a different set of writing patterns (e.g., normal/medium speed) of that person. however, it would be interesting to experimentally analyze whether there exists any implicit characteristic of individuality which is insensitive to high intra-variable handwriting. in this paper, we work on writer identification/verification from offline bengali handwriting of high intra-variability. to this end, we use two separate models for the writer identification/verification task: (a) a hand-crafted feature based svm model and (b) an auto-derived feature based recurrent neural network. for experimentation, we have generated a handwriting database from 100 writers and have obtained interesting results for training-testing with different writing speeds.",4 "anomaly detection and motif discovery in symbolic representations of time series. the advent of the big data hype and the consistent recollection of event logs and real-time data from sensors, monitoring software and machine configuration has generated a huge amount of time-varying data in almost every sector of the industry. rule-based processing of such data has ceased to be relevant in many scenarios where anomaly detection and pattern mining have to be entirely accomplished by the machine. since the early 2000s, the de-facto standard for representing time series has been the symbolic aggregate approximation (sax). in this document, we present the algorithms using this representation for anomaly detection and for motif discovery, also known as pattern mining, in such data. we propose a benchmark of anomaly detection algorithms using data from a cloud monitoring software.",4 "a review of network traffic analysis and prediction techniques. analysis and prediction of network traffic has applications in a wide comprehensive set of areas and has newly attracted a significant number of studies.
many different kinds of experiments have been conducted and summarized to identify the various problems in existing computer network applications. network traffic analysis and prediction is a proactive approach to ensure secure, reliable and qualitative network communication. various techniques have been proposed and experimented with for analyzing network traffic, including neural network based techniques and data mining techniques. similarly, various linear and non-linear models have been proposed for network traffic prediction. several interesting combinations of network analysis and prediction techniques have been implemented to attain efficient and effective results. this paper presents a survey on various such network analysis and traffic prediction techniques. the uniqueness and rules of previous studies are investigated. moreover, the various accomplished areas of analysis and prediction of network traffic are summed up.",4 "tensor canonical correlation analysis for multi-view dimension reduction. canonical correlation analysis (cca) has proven to be an effective tool for two-view dimension reduction due to its profound theoretical foundation and success in practical applications. in respect of multi-view learning, however, it has limited capability in handling data represented by more than two-view features, while in many real-world applications, the number of views is frequently many more. although an ad hoc way of simultaneously exploring all possible pairs of features can numerically deal with multi-view data, it ignores the high order statistics (correlation information) which can only be discovered by simultaneously exploring all features. therefore, in this work, we develop tensor cca (tcca), which straightforwardly yet naturally generalizes cca to handle the data of an arbitrary number of views by analyzing the covariance tensor of the different views. tcca aims to directly maximize the canonical correlation of multiple (more than two) views. crucially, we prove that the multi-view canonical correlation maximization problem is equivalent to finding the best rank-1 approximation of the data covariance tensor, which can be solved efficiently using the well-known alternating least squares (als) algorithm. as a consequence, the high order correlation information contained in the different views is explored and thus a reliable common subspace shared by all features can be obtained.
in addition, a non-linear extension of tcca is presented. experiments on various challenging tasks, including large scale biometric structure prediction, internet advertisement classification and web image annotation, demonstrate the effectiveness of the proposed method.",19 "isolating sources of disentanglement in variational autoencoders. we decompose the evidence lower bound to show the existence of a term measuring the total correlation between latent variables. we use this to motivate the $\beta$-tcvae (total correlation variational autoencoder), a refinement of the state-of-the-art $\beta$-vae objective for learning disentangled representations, requiring no additional hyperparameters during training. we further propose a principled classifier-free measure of disentanglement called the mutual information gap (mig). we perform extensive quantitative and qualitative experiments, in both restricted and non-restricted settings, and show a strong relation between total correlation and disentanglement, when the latent variables model is trained using our framework.",4 "spectral clustering with imbalanced data. spectral clustering is sensitive to how graphs are constructed from data, particularly when proximal and imbalanced clusters are present. we show that the ratio-cut (rcut) and normalized cut (ncut) objectives are not tailored to imbalanced data since they tend to emphasize cut sizes over cut values. we propose a graph partitioning problem that seeks minimum cut partitions under minimum size constraints on partitions to deal with imbalanced data. our approach parameterizes a family of graphs, by adaptively modulating node degrees on a fixed node set, to yield a set of parameter dependent cuts reflecting varying levels of imbalance. the solution to our problem is then obtained by optimizing over these parameters. we present rigorous limit cut analysis results to justify our approach. we demonstrate the superiority of our method through unsupervised and semi-supervised experiments on synthetic and real data sets.",19 "automated segmentation of pulmonary arteries in low-dose ct by vessel tracking. we present a fully automated method for top-down segmentation of the pulmonary arterial tree in low-dose thoracic ct images.
the main basal pulmonary arteries are identified near the lung hilum by searching for candidate vessels adjacent to known airways, which are identified by a previously reported airway segmentation method. model cylinders are iteratively fit to the vessels to track them into the lungs. vessel bifurcations are detected by measuring the rate of change of vessel radii, and child vessels are segmented by initiating new trackers at the bifurcation points. validation is accomplished using a novel sparse surface (ss) evaluation metric. the ss metric was designed to quantify the magnitude of the segmentation error per vessel while significantly decreasing the manual marking burden for the human user. a total of 210 arteries and 205 veins were manually marked across seven test cases. 134/210 arteries were correctly segmented, with a specificity for arteries of 90% and an average segmentation error of 0.15 mm. this fully-automated segmentation is a promising method for improving lung nodule detection in low-dose ct screening scans, by separating vessels from surrounding iso-intensity objects.",4 "fast vehicle detection in aerial imagery. in recent years, several real-time or near real-time object detectors have been developed. however these object detectors are typically designed for first-person view images, where the subject is large in the image, and do not directly apply well to detecting vehicles in aerial imagery. though some detectors have been developed for aerial imagery, these are either slow or do not handle multi-scale imagery well. here the popular yolov2 detector is modified to vastly improve its performance on aerial data. the modified detector is compared to faster rcnn on several aerial imagery datasets. the proposed detector gives near state of the art performance at 4x the speed.",4 "priority union and generalization in discourse grammars. we describe an implementation in carpenter's typed feature formalism, ale, of a discourse grammar of the kind proposed by scha, polanyi, et al. we examine their method of resolving parallelism-dependent anaphora and show that there is a coherent feature-structural rendition of this type of grammar which uses the operations of priority union and generalization.
we describe an augmentation of the ale system to encompass these operations and show how an appropriate choice of definition for priority union gives the desired multiple output for examples of vp-ellipsis which exhibit a strict/sloppy ambiguity.",2 "learning generalized reactive policies using deep neural networks. we consider the problem of learning for planning, where knowledge acquired while planning is reused to plan faster in new problem instances. for robotic tasks, among others, plan execution can be captured in a sequence of visual images. for such domains, we propose to use deep neural networks in learning for planning, based on learning a reactive policy that imitates execution traces produced by a planner. we investigate architectural properties of deep networks that are suitable for learning long-horizon planning behavior, and explore how to learn, in addition to the policy, a heuristic function that can be used with classical planners or search algorithms such as a*. our results on the challenging sokoban domain show that, with a suitable network design, complex decision making policies and powerful heuristic functions can be learned through imitation.",4 "inferring missing categorical information in noisy and sparse web markup. embedded markup of web pages has seen widespread adoption throughout the past years, driven by standards such as rdfa and microdata and initiatives such as schema.org, where recent studies show an adoption by 39% of all web pages already in 2016. it constitutes an important information source for tasks such as web search, web page classification or knowledge graph augmentation, but individual markup nodes are usually sparsely described and often lack essential information. for instance, of 26 million nodes describing events within the common crawl in 2016, 59% of the nodes provide less than six statements and only 257,000 nodes (0.96%) are typed with more specific event subtypes. nevertheless, given the scale and diversity of web markup data, nodes that provide missing information can be obtained from the web in large quantities, in particular for categorical properties. such data constitutes potential training data for inferring missing information to significantly augment sparsely described nodes. in this work, we introduce a supervised approach for inferring missing categorical properties in web markup.
our experiments, conducted on properties of events and movies, show a performance of 79% and 83% f1 score correspondingly, significantly outperforming existing baselines.",4 "enhancement of the performance of a road recognition system for autonomous robots in shadow scenarios. road region recognition is a main feature that is gaining increasing attention from intellectuals, as it helps autonomous vehicles achieve successful navigation without accidents. different techniques based on camera sensors have been used by various researchers and outstanding results have been achieved. despite this success, environmental noise like shadow leads to inaccurate recognition of the road region, which eventually leads to accidents for the autonomous vehicle. in this research, we conducted an investigation of shadow effects, and optimized a road region recognition system for autonomous vehicles by introducing an algorithm capable of detecting and eliminating the effects of shadow. the experimental performance of the system was tested and compared using the following schemes: total positive rate (tpr), false negative rate (fnr), total negative rate (tnr), error rate (err) and false positive rate (fpr). the performance result of the system improved road recognition in the shadow scenario, an advancement that adds tremendously to successful navigation approaches for autonomous vehicles.",4 "strategies for conceptual change in convolutional neural networks. a remarkable feature of human beings is their capacity for creative behaviour, referring to the ability to react to problems in ways that are novel, surprising, and useful. transformational creativity is a form of creativity where the creative behaviour is induced by a transformation of the actor's conceptual space, that is, the representational system with which the actor interprets its environment. in this report, we focus on ways of adapting systems of learned representations as they switch from performing one task to performing another. we describe an experimental comparison of multiple strategies for adaptation of learned features, and evaluate how effectively each of these strategies realizes the adaptation, in terms of the amount of training, and in terms of the ability to cope with restricted availability of training data.
we show, among other things, that across handwritten digits, natural images, and classical music, the adaptive strategies are systematically more effective than a baseline method that starts learning from scratch.",4 "robust named entity recognition in idiosyncratic domains. named entity recognition often fails in idiosyncratic domains. that causes a problem for depending tasks, such as entity linking and relation extraction. we propose a generic and robust approach for high-recall named entity recognition. our approach is easy to train and offers strong generalization over diverse domain-specific language, such as news documents (e.g., reuters) or biomedical text (e.g., medline). our approach is based on deep contextual sequence learning and utilizes stacked bidirectional lstm networks. our model is trained with only a hundred labeled sentences and does not rely on further external knowledge. we report from our results f1 scores in the range of 84-94% on standard datasets.",4 "apptechminer: mining applications and techniques from scientific articles. this paper presents apptechminer, a rule-based information extraction framework that automatically constructs a knowledge base of all application areas and problem solving techniques. techniques include tools, methods, datasets and evaluation metrics. we also categorize individual research articles based on their application areas and the techniques proposed/improved in the article. our system achieves high average precision (~82%) and recall (~84%) in knowledge base creation. it also performs well in application and technique assignment for individual articles (average accuracy ~66%). in the end, we further present two use cases, presenting a trivial information retrieval system and an extensive temporal analysis of the usage of techniques and application areas. at present, we demonstrate the framework for the domain of computational linguistics, but it can be easily generalized to any other field of research.",4 "note: variational encoding of protein dynamics benefits from maximizing latent autocorrelation.
as deep variational auto-encoder (vae) frameworks become more widely used for modeling biomolecular simulation data, we emphasize the capability of the vae architecture to concurrently maximize the timescale of the latent space while inferring a reduced coordinate, which assists in finding slow processes according to the variational approach to conformational dynamics. we additionally provide evidence that the vde framework (hern\'andez et al., 2017), which uses an autocorrelation loss along with a time-lagged reconstruction loss, obtains a variationally optimized latent coordinate in comparison with related loss functions. we thus recommend leveraging the autocorrelation of the latent space while training neural network models of biomolecular simulation data to better represent slow processes.",15 "differential evolution with event-triggered impulsive control. differential evolution (de) is a simple but powerful evolutionary algorithm, which has been widely and successfully used in various areas. in this paper, an event-triggered impulsive control scheme (eti) is introduced to improve the performance of de. impulsive control, the concept of which derives from control theory, aims at regulating the states of a network by instantly adjusting the states of a fraction of nodes at certain instants, and these instants are determined by an event-triggered mechanism (etm). by introducing impulsive control and etm into de, we hope to change the search performance of the population in a positive way after revising the positions of some individuals at certain moments. at the end of each generation, the impulsive control operation is triggered when the update rate of the population declines or equals zero. in detail, inspired by the concepts of impulsive control, two types of impulses are presented within the framework of de in this paper: stabilizing impulses and destabilizing impulses. the stabilizing impulses help the individuals with lower rankings instantly move to a desired state determined by the individuals with better fitness values. the destabilizing impulses randomly alter the positions of inferior individuals within the range of the current population. by means of intelligently modifying the positions of a part of the individuals with these two kinds of impulses, both the exploitation and exploration abilities of the whole population can be meliorated. in addition, the proposed eti is flexible enough to be incorporated into several state-of-the-art de variants.
experimental results on the cec 2014 benchmark functions exhibit that the developed scheme is simple yet effective, and significantly improves the performance of the considered de algorithms.",4 "improving the expected improvement algorithm. the expected improvement (ei) algorithm is a popular strategy for information collection in optimization under uncertainty. the algorithm is widely known to be greedy, but nevertheless enjoys wide use due to its simplicity and ability to handle uncertainty and noise in a coherent decision theoretic framework. to provide rigorous insight into ei, we study its properties in the simple setting of bayesian optimization where the domain consists of a finite grid of points. this is the so-called best-arm identification problem, where the goal is to allocate measurement effort wisely to confidently identify the best arm using a small number of measurements. in this framework, one can show formally that ei is far from optimal. to overcome this shortcoming, we introduce a simple modification of the expected improvement algorithm. surprisingly, this simple change results in an algorithm that is asymptotically optimal for gaussian best-arm identification problems, and provably outperforms standard ei by an order of magnitude.",4 "maximum resilience of artificial neural networks. the deployment of artificial neural networks (anns) in safety-critical applications poses a number of new verification and certification challenges. in particular, for ann-enabled self-driving vehicles it is important to establish properties about the resilience of anns to noisy or even maliciously manipulated sensory input. we are addressing these challenges by defining resilience properties of ann-based classifiers as the maximal amount of input or sensor perturbation which is still tolerated. this problem of computing maximal perturbation bounds for anns is then reduced to solving mixed integer optimization problems (mip). a number of mip encoding heuristics are developed for drastically reducing mip-solver runtimes, and using parallelization of mip-solvers results in an almost linear speed-up in the number (up to a certain limit) of computing cores in our experiments.
we demonstrate the effectiveness and scalability of our approach by means of computing maximal resilience bounds for a number of ann benchmark sets ranging from typical image recognition scenarios to the autonomous maneuvering of robots.",4 "building a telescope to look into high-dimensional image spaces. an image pattern can be represented by a probability distribution whose density is concentrated on different low-dimensional subspaces in the high-dimensional image space. such probability densities have an astronomical number of local modes corresponding to typical pattern appearances. related groups of modes can join to form macroscopic image basins that represent pattern concepts. recent works use neural networks that capture high-order image statistics to learn gibbs models capable of synthesizing realistic images of many patterns. however, characterizing a learned probability density to uncover the hopfield memories of the model, encoded by the structure of the local modes, remains an open challenge. in this work, we present novel computational experiments that map and visualize the local mode structure of gibbs densities. efficient mapping requires identifying the global basins without enumerating the countless modes. inspired by grenander's jump-diffusion method, we propose a new mcmc tool called attraction-diffusion (ad) that can capture the macroscopic structure of highly non-convex densities by measuring the metastability of local modes. ad involves altering the target density with a magnetization potential penalizing distance from a known mode and running an mcmc sample of the altered density to measure the stability of the initial chain state. using a low-dimensional generator network to facilitate exploration, we map image spaces with up to 12,288 dimensions (64 $\times$ 64 pixels in rgb). our work shows: (1) ad can efficiently map highly non-convex probability densities, (2) metastable regions of pattern probability densities contain coherent groups of images, and (3) the perceptibility of differences between training images influences the metastability of image basins.",19 "a fast fractal image compression algorithm using predefined values for contrast scaling. in this paper a new fractal image compression algorithm is proposed, in which the time of the encoding process is considerably reduced.
the algorithm exploits a domain pool reduction approach, along with using innovative predefined values for the contrast scaling factor, s, instead of scanning the parameter space [0,1]. within this approach only domain blocks with entropies greater than a threshold are considered. as a novel point, it is assumed that in each step of the encoding process, if a domain block with a small enough distance is to be found, it shall be found among the range blocks with low activity (equivalently low entropy). this novel point is used to find reasonable estimations of s, and to use them in the encoding process as the predefined values mentioned above. the algorithm has been examined on some well-known images. the results show that the proposed algorithm considerably reduces the encoding time while producing images of approximately the same quality.",4 "open problem: tightness of maximum likelihood semidefinite relaxations. we have observed an interesting, yet unexplained, phenomenon: semidefinite programming (sdp) based relaxations of maximum likelihood estimators (mle) tend to be tight in recovery problems with noisy data, even when the mle cannot exactly recover the ground truth. several results establish the tightness of sdp based relaxations in the regime where exact recovery from the mle is possible. however, to the best of our knowledge, their tightness is not understood beyond this regime. as an illustrative example, we focus on the generalized procrustes problem.",12 "out-distribution training confers robustness to deep neural networks. the easiness with which adversarial instances can be generated in deep neural networks raises some fundamental questions on their functioning and concerns on their use in critical systems. in this paper, we draw a connection between over-generalization and adversaries: a possible cause of adversaries lies in models designed to make decisions all over the input space, leading to inappropriate high-confidence decisions in parts of the input space not represented in the training set. we empirically show that an augmented neural network, which is not trained on any types of adversaries, can increase robustness by detecting black-box one-step adversaries, i.e. those assimilated to out-distribution samples, and by making the generation of white-box one-step adversaries harder.",4 "chalearn looking at people: a review of events and resources. this paper reviews the history of chalearn looking at people (lap) events.
we started in 2011 (with the release of the first kinect device) to run challenges related to human action/activity and gesture recognition. since then we have regularly organized events in a series of competitions covering all aspects of visual analysis of humans. so far we have organized more than 10 international challenges and events in this field. this paper reviews the associated events, and introduces the chalearn lap platform where public resources (including code, data and preprints of papers) related to the organized events are available. we also provide a discussion on the perspectives of chalearn lap activities.",4 "a unified approach of multi-scale deep and hand-crafted features for defocus estimation. in this paper, we introduce robust and synergetic hand-crafted features and a simple but efficient deep feature from a convolutional neural network (cnn) architecture for defocus estimation. this paper systematically analyzes the effectiveness of different features, and shows how each feature can compensate for the weaknesses of the other features when they are concatenated. for a full defocus map estimation, we extract image patches on strong edges sparsely, to use them for deep and hand-crafted feature extraction. in order to reduce the degree of patch-scale dependency, we also propose a multi-scale patch extraction strategy. a sparse defocus map is generated using a neural network classifier followed by a probability-joint bilateral filter. the final defocus map is obtained from the sparse defocus map with the guidance of an edge-preserving filtered input image. experimental results show that our algorithm is superior to state-of-the-art algorithms in terms of defocus estimation. our work can be used for applications such as segmentation, blur magnification, all-in-focus image generation, and 3-d estimation.",4 "made: masked autoencoder for distribution estimation. there has been a lot of recent interest in designing neural network models to estimate a distribution from a set of examples. we introduce a simple modification for autoencoder neural networks that yields powerful generative models. our method masks the autoencoder's parameters to respect autoregressive constraints: each input is reconstructed only from previous inputs in a given ordering. constrained this way, the autoencoder outputs can be interpreted as a set of conditional probabilities, and their product, the full joint probability.
we can also train a single network to decompose the joint probability in multiple different orderings. our simple framework can be applied to multiple architectures, including deep ones. vectorized implementations, such as on gpus, are simple and fast. experiments demonstrate that this approach is competitive with state-of-the-art tractable distribution estimators. at test time, the method is significantly faster and scales better than other autoregressive estimators.",4 "recovering missing sensor data with iterative imputing network. sensor data has been playing an important role in machine learning tasks, complementary to the human-annotated data that is usually rather costly. however, due to systematic or accidental mis-operations, sensor data often comes with a variety of missing values, resulting in considerable difficulties in the follow-up analysis and visualization. previous work imputes the missing values by interpolating in the observational feature space, without consulting any latent (hidden) dynamics. in contrast, our model captures the latent complex temporal dynamics by summarizing each observation's context with a novel iterative imputing network, and thus significantly outperforms previous work on the benchmark beijing air quality and meteorological dataset. our model also yields consistent superiority over other methods in cases of different missing rates.",4 "incorporating road networks into territory design. given a set of basic areas, the territory design problem asks to create a predefined number of territories, each containing at least one basic area, such that an objective function is optimized. desired properties of territories often include a reasonable balance, compact form, contiguity and small average journey times, which are usually encoded in the objective function or formulated as constraints. we address the territory design problem by developing graph theoretic models that also consider the underlying road network. the derived graph models enable us to tackle the territory design problem by modifying graph partitioning algorithms and mixed integer programming formulations so that the objective of the planning problem is taken into account. we test and compare the algorithms on several real world instances.",12 "learning certifiably optimal rule lists for categorical data.
we present the design and implementation of a custom discrete optimization technique for building rule lists over a categorical feature space. our algorithm produces rule lists with optimal training performance, according to the regularized empirical risk, with a certificate of optimality. by leveraging algorithmic bounds, efficient data structures, and computational reuse, we achieve several orders of magnitude speedup in time and a massive reduction of memory consumption. we demonstrate that our approach produces optimal rule lists on practical problems in seconds. our results indicate that it is possible to construct optimal sparse rule lists that are approximately as accurate as the compas proprietary risk prediction tool on data from broward county, florida, but that are completely interpretable. this framework is a novel alternative to cart and other decision tree methods for interpretable modeling.",19 "a new computational framework for 2d shape-enclosing contours. in this paper, a new framework for one-dimensional contour extraction from discrete two-dimensional data sets is presented. contour extraction is important in many scientific fields such as digital image processing, computer vision, pattern recognition, etc. this novel framework includes (but is not limited to) algorithms for dilated contour extraction, contour displacement, shape skeleton extraction, contour continuation, shape feature based contour refinement and contour simplification. many of these new techniques depend strongly on the application of a delaunay tessellation. in order to demonstrate the versatility of this novel toolbox approach, the contour extraction techniques presented here have been applied to scientific problems in material science, biology and heavy ion physics.",4 "unsupervised video understanding by reconciliation of posture similarities. understanding human activity and being able to explain it in detail surpasses mere action classification by far in both complexity and value. the challenge is thus to describe an activity on the basis of its most fundamental constituents, the individual postures and their distinctive transitions. supervised learning of such a fine-grained representation based on elementary poses is very tedious and does not scale.
therefore, we propose a completely unsupervised deep learning procedure based solely on video sequences, which starts from scratch without requiring pre-trained networks, predefined body models, or keypoints. a combinatorial sequence matching algorithm proposes relations between frames from subsets of the training data, while a cnn reconciles the transitivity conflicts of the different subsets to learn a single concerted pose embedding despite changes in appearance across sequences. without any manual annotation, the model learns a structured representation of postures and their temporal development. the model not only enables retrieval of similar postures but also temporal super-resolution. additionally, based on a recurrent formulation, next frames can be synthesized.",4 "a deep convolutional auto-encoder with pooling - unpooling layers in caffe. this paper presents the development of several models of a deep convolutional auto-encoder in the caffe deep learning framework and their experimental evaluation on the example of the mnist dataset. we have created five models of a convolutional auto-encoder which differ architecturally by the presence or absence of pooling and unpooling layers in the auto-encoder's encoder and decoder parts. our results show that the developed models provide very good results in dimensionality reduction and unsupervised clustering tasks, and small classification errors when we used the learned internal code as an input of a supervised linear classifier and multi-layer perceptron. the best results were provided by a model where the encoder part contains convolutional and pooling layers, followed by an analogous decoder part with deconvolution and unpooling layers without the use of switch variables in the decoder part. the paper also discusses practical details of the creation of a deep convolutional auto-encoder in the popular caffe deep learning framework. we believe that the approach and results presented in this paper could help other researchers to build efficient deep neural network architectures in the future.",4 "languages cool as they expand: allometric scaling and the decreasing need for new words. we analyze the occurrence frequencies of over 15 million words recorded in millions of books published during the past two centuries in seven different languages.
for all languages and chronological subsets of the data we confirm that two scaling regimes characterize the word frequency distributions, with only the more common words obeying the classic zipf law. using corpora of unprecedented size, we test the allometric scaling relation between the corpus size and vocabulary size of growing languages and demonstrate a decreasing marginal need for new words, a feature that is likely related to the underlying correlations between words. we calculate the annual growth fluctuations of word use, which show a decreasing trend as the corpus size increases, indicating a slowdown in linguistic evolution following language expansion. this ""cooling pattern"" forms the basis of a third statistical regularity, which, unlike the zipf and heaps laws, is dynamical in nature.",15 "jabalin: a comprehensive computational model of modern standard arabic verbal morphology based on traditional arabic prosody. the computational handling of modern standard arabic is a challenge in the field of natural language processing due to its highly rich morphology. however, several authors have pointed out that the arabic morphological system is in fact extremely regular. the existing arabic morphological analyzers have exploited this regularity to a variable extent, yet we believe there is still scope for improvement. taking inspiration from traditional arabic prosody, we have designed and implemented a compact and simple morphological system which in our opinion takes advantage of the regularities encountered in the arabic morphological system. the output of the system is a large-scale lexicon of inflected forms that has subsequently been used to create an online interface for a morphological analyzer of arabic verbs. the jabalin online interface is available at http://elvira.lllf.uam.es/jabalin/, hosted at the lli-uam lab. the generation system is also available under a gnu gpl 3 license.",4 "maximum entropy models for generation of expressive music. in the context of contemporary monophonic music, expression can be seen as the difference between a musical performance and its symbolic representation, i.e. a musical score. in this paper, we show how maximum entropy (maxent) models can be used to generate musical expression in order to mimic a human performance. as a training corpus, we had a professional pianist play about 150 melodies from jazz, pop, and latin jazz.
additionally, we set up a listening test whose results reveal that on average, people significantly prefer the melodies generated by the maxent model over the ones without expression, or with fully random expression. furthermore, in some cases, the maxent melodies are almost as popular as the human performed ones.",4 "a 3d fully convolutional neural network and a random walker to segment the esophagus in ct. precise delineation of organs at risk (oar) is a crucial task in radiotherapy treatment planning, which aims at delivering a high dose to the tumour while sparing healthy tissues. in recent years algorithms have shown high performance and the possibility to automate this task for many oar. however, for some oar precise delineation remains challenging. the esophagus, with its versatile shape and poor contrast among neighbouring structures, is among them. to tackle these issues we propose a 3d fully convolutional neural network (cnn) driven random walk (rw) approach to automatically segment the esophagus in ct. first, a soft probability map is generated by the cnn. then an active contour model (acm) is fitted on the probability map to get a first estimation of the center line. the outputs of the cnn and the acm are then used in addition to the ct hounsfield values to drive the rw. evaluation and training was done on 50 cts with peer reviewed esophagus contours. results were assessed regarding spatial overlap and shape similarities. the generated contours showed a mean dice coefficient of 0.76, an average symmetric square distance of 1.36 mm and an average hausdorff distance of 11.68 compared to the reference. these figures translate into a good agreement with the reference contours and an increase in accuracy compared to other methods. we show that by employing a cnn accurate estimations of the esophagus location can be obtained and refined by a post processing rw step. one of the main advantages compared to previous methods is that our network performs convolutions in a 3d manner, fully exploiting the 3d spatial context and performing an efficient and precise volume-wise prediction. the whole segmentation process is fully automatic and yields esophagus delineations in good agreement with the used gold standard, showing that it can compete with previously published methods.",4 "a consistency-based model for belief change: preliminary report. we present a general, consistency-based framework for belief change. informally, in revising k by a, we begin with a and incorporate as much of k as is consistently possible.
formally, a knowledge base k and a sentence a are expressed, via a renaming of the propositions in k, in separate languages. using a maximization process, we assimilate the languages insofar as is consistently possible. lastly, we express the resultant knowledge base in a single language. there may be more than one way in which k can be so extended by a: in choice revision, one ``extension'' represents the revised state; alternately the revision consists of the intersection of all such extensions. the general formulation of our approach is flexible enough to express other approaches to revision and update, the merging of knowledge bases, and the incorporation of static and dynamic integrity constraints. the framework differs from work based on ordinal conditional functions, most notably with respect to iterated revision. we argue that the approach is well-suited for implementation: the choice revision operator gives better complexity results than general revision; the approach can be expressed in terms of a finite knowledge base; and the scope of a revision can be restricted to those propositions mentioned in the sentence for revision a.",4 "unsupervised prototype learning in an associative-memory network. unsupervised learning in a generalized hopfield associative-memory network is investigated in this work. first, we prove that the (generalized) hopfield model is equivalent to a semi-restricted boltzmann machine with a layer of visible neurons and another layer of hidden binary neurons, so it could serve as the building block for a multilayered deep-learning system. we then demonstrate that the hopfield network can learn to form a faithful internal representation of the observed samples, with the learned memory patterns being prototypes of the input data. furthermore, we propose a spectral method to extract a small set of concepts (idealized prototypes) as a concise summary and abstraction of the empirical data.",4 "ontology based scene creation for the development of automated vehicles. the introduction of automated vehicles without permanent human supervision demands a functional system description, including functional system boundaries and a comprehensive safety analysis. these inputs to the technical development can be identified and analyzed by a scenario-based approach. furthermore, to establish an economical test and release process, a large number of scenarios must be identified to obtain meaningful test results. experts are well able to identify scenarios that are difficult to handle or unlikely to happen.
however, experts are unlikely to identify all scenarios that are possible, based only on the knowledge they have at hand. expert knowledge modeled for computer aided processing may help for this purpose by providing a wide range of scenarios. this contribution reviews ontologies as knowledge-based systems in the field of automated vehicles, and proposes the generation of traffic scenes in natural language as a basis for scenario creation.",4 "intelligent parameter tuning in optimization-based iterative ct reconstruction via deep reinforcement learning. a number of image-processing problems can be formulated as optimization problems. the objective function typically contains several terms specifically designed for different purposes. parameters in front of these terms are used to control the relative weights among them. it is of critical importance to tune these parameters, as the quality of the solution depends on their values. tuning a parameter is a relatively straightforward task for a human, as one can intelligently determine the direction of parameter adjustment based on the solution quality. yet manual parameter tuning is not only tedious in many cases, but becomes impractical when a number of parameters exist in a problem. aiming at solving this problem, this paper proposes an approach that employs deep reinforcement learning to train a system that can automatically adjust parameters in a human-like manner. we demonstrate our idea in an example problem of optimization-based iterative ct reconstruction with a pixel-wise total-variation regularization term. we set up a parameter tuning policy network (ptpn), which maps a ct image patch to an output that specifies the direction and amplitude by which the parameter at the patch center is adjusted. we train the ptpn via an end-to-end reinforcement learning procedure. we demonstrate that under the guidance of the trained ptpn for parameter tuning at each pixel, reconstructed ct images attain quality similar to or better than those reconstructed with manually tuned parameters.",15 "approximation beats concentration? an approximation view on inference with smooth radial kernels. positive definite kernels and their associated reproducing kernel hilbert spaces provide a mathematically compelling and practically competitive framework for learning from data. in this paper we take the approximation theory point of view to explore various aspects of smooth kernels related to their inferential properties.
we analyze the eigenvalue decay of kernel operators and matrices, the properties of eigenfunctions/eigenvectors and the ""fourier"" coefficients of functions in the kernel space restricted to a discrete set of data points. we also investigate the fitting capacity of kernels, giving explicit bounds on the fat shattering dimension of the balls in reproducing kernel hilbert spaces. interestingly, the same properties that make kernels very effective approximators for functions in their ""native"" kernel space also limit their capacity to represent arbitrary functions. we discuss various implications, including those for gradient descent type methods. it is important to note that most of our bounds are measure independent. moreover, at least in moderate dimension, the bounds for eigenvalues are much tighter than the bounds obtained from the usual matrix concentration results. for example, we see that the eigenvalues of kernel matrices show nearly exponential decay with constants depending only on the kernel and the domain. we call this ""approximation beats concentration"" phenomenon, as even when the data are sampled from a probability distribution, some of its aspects are better understood in terms of approximation theory.",4 "sparse partially linear additive models. the generalized partially linear additive model (gplam) is a flexible and interpretable approach to building predictive models. it combines features in an additive manner, allowing each to have either a linear or nonlinear effect on the response. however, the choice of which features to treat as linear or nonlinear is typically assumed known. thus, to make the gplam a viable approach in situations in which little is known $a~priori$ about the features, one must overcome two primary model selection challenges: deciding which features to include in the model and determining which features to treat nonlinearly. we introduce the sparse partially linear additive model (splam), which combines model fitting and $both$ of these model selection challenges into a single convex optimization problem. splam provides a bridge between the lasso and sparse additive models. through a statistical oracle inequality and thorough simulation, we demonstrate that splam can outperform other methods across a broad spectrum of statistical regimes, including the high-dimensional ($p\gg n$) setting.
we develop efficient algorithms that are applied to real data sets with half a million samples and 45,000 features with excellent predictive performance.",19 "deep disentangled representations for volumetric reconstruction. we introduce a convolutional neural network for inferring a compact disentangled graphical description of objects from 2d images that can be used for volumetric reconstruction. the network comprises an encoder and a twin-tailed decoder. the encoder generates a disentangled graphics code. the first decoder generates a volume, and the second decoder reconstructs the input image using a novel training regime that allows the graphics code to learn a separate representation of the 3d object and a description of its lighting and pose conditions. we demonstrate the method by generating volumes and disentangled graphical descriptions from images and videos of faces and chairs.",4 "equitability, interval estimation, and statistical power. for the analysis of a high-dimensional dataset, a common approach is to test a null hypothesis of statistical independence on all variable pairs using a non-parametric measure of dependence. however, because this approach attempts to identify any non-trivial relationship no matter how weak, it often identifies too many relationships to be useful. what is needed is a way of identifying a smaller set of relationships that merit detailed further analysis. here we formally present and characterize equitability, a property of measures of dependence that aims to overcome this challenge. notionally, an equitable statistic is a statistic that, given some measure of noise, assigns similar scores to equally noisy relationships of different types [reshef et al. 2011]. we begin by formalizing this idea via a new object called the interpretable interval, which functions as an interval estimate of the amount of noise in a relationship of unknown type. we then define an equitable statistic as one with small interpretable intervals. we draw on the equivalence between interval estimation and hypothesis testing to show that under moderate assumptions an equitable statistic is one that yields well powered tests for distinguishing not only trivial from non-trivial relationships of all kinds but also non-trivial relationships of different strengths. this means that equitability allows us to specify a threshold relationship strength $x_0$ and to search for relationships of all kinds with strength greater than $x_0$.
thus, equitability can be thought of as a strengthening of power against independence that enables fruitful analysis of data sets with a small number of strong, interesting relationships and a large number of weaker ones. we conclude with a demonstration of how our two equivalent characterizations of equitability can be used to evaluate the equitability of a statistic in practice.",12 "the interaction of entropy-based discretization and sample size: an empirical study. an empirical investigation of the interaction of sample size and discretization - in this case the entropy-based method caim (class-attribute interdependence maximization) - was undertaken to evaluate the impact of and potential bias introduced into data mining performance metrics due to variation in sample size as it impacts the discretization process. of particular interest was the effect of discretizing within cross-validation folds as opposed to outside of the discretization folds. previous publications have suggested that discretizing externally can bias performance results; however, a thorough review of the literature found no empirical evidence to support such an assertion. this investigation involved the construction of over 117,000 models on seven distinct datasets from the uci (university of california-irvine) machine learning library and multiple modeling methods across a variety of configurations of sample size and discretization, with each unique ""setup"" independently replicated ten times. the analysis revealed a significant optimistic bias as sample sizes decreased and discretization was employed. the study also revealed that there may be a relationship between the interaction that produces such bias and the numbers and types of predictor attributes, extending the ""curse of dimensionality"" concept from feature selection into the discretization realm. directions for further exploration are laid out, as well as general guidelines for the proper application of discretization in light of these results.",19 "linguistic reflexes of well-being and happiness in echo. different theories posit different sources for feelings of well-being and happiness. appraisal theory grounds emotional responses in goals and desires and their fulfillment, or lack of fulfillment. self determination theory posits that the basis for well-being rests on our assessment of our competence, autonomy, and social connection.
and surveys that measure happiness empirically note that people require their basic needs met, such as food and shelter, but beyond that tend to be happiest when socializing, eating or having sex. we analyze a corpus of private microblogs from a well-being application called echo, where users label each written post about daily events with a happiness score between 1 and 9. our goal is to ground the linguistic descriptions of events that users experience in theories of well-being and happiness, and then examine the extent to which different theoretical accounts can explain the variance in the happiness scores. we show that recurrent event types, such as obligation and incompetence, which affect people's feelings of well-being, are not captured in current lexical or semantic resources.",4 "an investigation of using vae for i-vector speaker verification. a new system for i-vector speaker recognition based on a variational autoencoder (vae) is investigated. the vae is a promising approach for developing accurate deep nonlinear generative models of complex data. experiments show that the vae provides a speaker embedding that can be effectively trained in an unsupervised manner. an llr estimate for the vae is developed. experiments on the nist sre 2010 data demonstrate its correctness. additionally, we show that the performance of the vae-based system in the i-vector space is close to that of the diagonal plda. several interesting results are also observed in the experiments with the $\beta$-vae. in particular, we found that with $\beta\ll 1$, the vae can be trained to capture the features of complex input data distributions in an effective way, which is hard to obtain with the standard vae ($\beta=1$).",4 "point-wise convolutional neural network. deep learning with 3d data such as reconstructed point clouds and cad models has received great research interest recently. however, the capability of using point clouds with a convolutional neural network has so far not been fully explored. in this technical report, we present a convolutional neural network for semantic segmentation and object recognition with 3d point clouds. at the core of our network is point-wise convolution, a convolution operator that can be applied at each point of a point cloud. our fully convolutional network design, while being simple to implement, can yield competitive accuracy in both the semantic segmentation and object recognition tasks.",4 "salsa: a novel dataset for multimodal group behavior analysis.
studying free-standing conversational groups (fcgs) in unstructured social settings (e.g., a cocktail party) is gratifying due to the wealth of information available at the group (mining social networks) and individual (recognizing native behavioral and personality traits) levels. however, analyzing social scenes involving fcgs is also highly challenging due to the difficulty in extracting behavioral cues such as target locations, their speaking activity and head/body pose, owing to crowdedness and the presence of extreme occlusions. to this end, we propose salsa, a novel dataset facilitating multimodal and synergetic social scene analysis, and make two main contributions to research on automated social interaction analysis: (1) salsa records social interactions among 18 participants in a natural, indoor environment for over 60 minutes, under the poster presentation and cocktail party contexts presenting difficulties in the form of low-resolution images, lighting variations, numerous occlusions, reverberations and interfering sound sources; (2) to alleviate these problems and facilitate multimodal analysis, we recorded the social interplay using four static surveillance cameras and sociometric badges worn by each participant, comprising a microphone, accelerometer, bluetooth and infrared sensors. in addition to raw data, we also provide annotations concerning individuals' personality as well as their position, head and body orientation and f-formation information over the entire event duration. through extensive experiments with state-of-the-art approaches, we show (a) the limitations of current methods and (b) how the recorded multiple cues synergetically aid automatic analysis of social interactions. salsa is available at http://tev.fbk.eu/salsa.",4 "local color contrastive descriptor for image classification. image representation and classification are two fundamental tasks towards multimedia content retrieval and understanding. the idea that shape and texture information (e.g. edge or orientation) are the key features for visual representation is ingrained and has dominated the current multimedia and computer vision communities. a number of low-level features have been proposed by computing local gradients (e.g. sift, lbp and hog), and have achieved great successes in numerous multimedia applications.
in this paper, we present a simple yet efficient local descriptor for image classification, referred to as the local color contrastive descriptor (lccd), by leveraging the neural mechanisms of color contrast. the idea originates from the observation in neural science that color and shape information are linked inextricably in visual cortical processing. the color contrast yields key information for visual color perception and provides a strong linkage between color and shape. we propose a novel contrastive mechanism to compute the color contrast in both spatial location and multiple channels. the color contrast is computed by measuring the \emph{f}-divergence between the color distributions of two regions. our descriptor enriches local image representation with color contrast information. we verified experimentally that it can compensate strongly for the shape based descriptor (e.g. sift), while keeping itself computationally simple. extensive experimental results on image classification show that our descriptor improves the performance of sift substantially in combinations, and achieves state-of-the-art performance on three challenging benchmark datasets. it improves the recent deep learning model (decaf) [1] largely, with accuracy from 40.94% to 49.68% on the large scale sun397 database. codes for lccd will be available.",4 "real time image saliency for black box classifiers. in this work we develop a fast saliency detection method that can be applied to any differentiable image classifier. we train a masking model to manipulate the scores of the classifier by masking salient parts of the input image. our model generalises well to unseen images and requires a single forward pass to perform saliency detection, and is therefore suitable for use in real-time systems. we test our approach on the cifar-10 and imagenet datasets and show that the produced saliency maps are easily interpretable, sharp, and free of artifacts. we suggest a new metric for saliency and test our method on the imagenet object localisation task. we achieve results outperforming other weakly supervised methods.",19 "3d scanning: a comprehensive survey. this paper provides an overview of 3d scanning methodologies and technologies proposed in the existing scientific and industrial literature.
throughout the paper, various types of the related techniques are reviewed, which consist, mainly, of close-range, aerial, structure-from-motion and terrestrial photogrammetry, and mobile, terrestrial and airborne laser scanning, as well as time-of-flight, structured-light and phase-comparison methods, along with comparative and combinational studies, the latter being intended to help make a clearer distinction on the relevance and reliability of the possible choices. moreover, outlier detection and surface fitting procedures are discussed concisely, which are necessary post-processing stages.",4 "a collaborative mechanism for crowdsourcing prediction problems. machine learning competitions such as the netflix prize have proven reasonably successful as a method of ""crowdsourcing"" prediction tasks. but these competitions have a number of weaknesses, particularly in the incentive structure they create for the participants. we propose a new approach, called a crowdsourced learning mechanism, in which participants collaboratively ""learn"" a hypothesis for a given prediction task. the approach draws heavily from the concept of a prediction market, where traders bet on the likelihood of a future event. in our framework, the mechanism continues to publish the current hypothesis, and participants can modify this hypothesis by wagering on an update. the critical incentive property is that a participant can profit an amount that scales according to how much her update improves performance on a released test set.",4 "functional regularized least squares classification with operator-valued kernels. although operator-valued kernels have recently received increasing interest in various machine learning and functional data analysis problems such as multi-task learning and functional regression, little attention has been paid to the understanding of their associated feature spaces. in this paper, we explore the potential of adopting an operator-valued kernel feature space perspective for the analysis of functional data. we extend the regularized least squares classification (rlsc) algorithm to cover situations where there are multiple functions per observation. experiments on a sound recognition problem show that the proposed method outperforms the classical rlsc algorithm.",4 "a perspective on deep imaging.
the combination of tomographic imaging and deep learning, or machine learning in general, promises to empower not only image analysis but also image reconstruction. the latter aspect is considered in this perspective article with an emphasis on medical imaging to develop a new generation of image reconstruction theories and techniques. this direction might lead to intelligent utilization of domain knowledge from big data, innovative approaches for image reconstruction, and superior performance in clinical and preclinical applications. to realize the full impact of machine learning on medical imaging, major challenges must be addressed.",16 "ellipsoidal rounding for nonnegative matrix factorization under noisy separability. we present a numerical algorithm for nonnegative matrix factorization (nmf) problems under noisy separability. an nmf problem under separability can be stated as one of finding the vertices of the convex hull of data points. the research interest of this paper is to find the vectors as close to the vertices as possible in a situation in which noise is added to the data points. our algorithm is designed to capture the shape of the convex hull of data points by using its enclosing ellipsoid. we show that the algorithm has correctness and robustness properties from theoretical and practical perspectives; correctness here means that if the data points do not contain noise, the algorithm can find the vertices of their convex hull; robustness means that if the data points contain noise, the algorithm can find the near-vertices. finally, we apply the algorithm to document clustering, and report the experimental results.",19 "deep scattering: rendering atmospheric clouds with radiance-predicting neural networks. we present a technique for efficiently synthesizing images of atmospheric clouds using a combination of monte carlo integration and neural networks. the intricacies of lorenz-mie scattering and the high albedo of cloud-forming aerosols make rendering of clouds---e.g. the characteristic silverlining and the ""whiteness"" of the inner body---challenging for methods based solely on monte carlo integration or diffusion theory. we approach the problem differently. instead of simulating all light transport during rendering, we pre-learn the spatial and directional distribution of radiant flux from tens of cloud exemplars.
to render a new scene, we sample visible points of the cloud and, for each, extract a hierarchical 3d descriptor of the cloud geometry with respect to the shading location and the light source. the descriptor is input to a deep neural network that predicts the radiance function for each shading configuration. we make the key observation that progressively feeding the hierarchical descriptor into the network enhances the network's ability to learn faster and predict with high accuracy while using few coefficients. we also employ a block design with residual connections to further improve performance. a gpu implementation of our method synthesizes images of clouds that are nearly indistinguishable from the reference solution within seconds or even interactively. our method thus represents a viable solution for applications such as cloud design and, thanks to its temporal stability, also for high-quality production of animated content.",4 "dragon: a computation graph virtual machine based deep learning framework. deep learning has made great progress over these years. however, it is still difficult to master the implementations of various models, because different researchers may release their code based on different frameworks or interfaces. in this paper, we propose a computation graph based framework which aims to introduce well-known interfaces. it will help a lot in reproducing newly published models and transplanting models that were implemented by other frameworks. additionally, we implement numerous recent models covering both computer vision and natural language processing. we demonstrate that our framework will not suffer from model-starving because it makes it much easier to make full use of the works that are already done.",4 "latent intention dialogue models. developing a dialogue agent that is capable of making autonomous decisions and communicating by natural language is one of the long-term goals of machine learning research. traditional approaches either rely on hand-crafting a small state-action set for applying reinforcement learning that is not scalable, or construct deterministic models for learning dialogue sentences that fail to capture natural conversational variability. in this paper, we propose a latent intention dialogue model (lidm) that employs a discrete latent variable to learn underlying dialogue intentions in the framework of neural variational inference.
in a goal-oriented dialogue scenario, these latent intentions can be interpreted as actions guiding the generation of machine responses, which can be further refined autonomously by reinforcement learning. the experimental evaluation of lidm shows that the model out-performs published benchmarks for both corpus-based and human evaluation, demonstrating the effectiveness of discrete latent variable models on learning goal-oriented dialogues.",4 "applying mdl to learn best model granularity. the minimum description length (mdl) principle is solidly based on a provably ideal method of inference using kolmogorov complexity. we test how the theory behaves in practice on a general problem in model selection: that of learning the best model granularity. the performance of a model depends critically on the granularity, for example the choice of precision of the parameters. too high a precision generally involves modeling of accidental noise and too low a precision may lead to confusion of models that should be distinguished. this precision is often determined ad hoc. in mdl the best model is the one that compresses a two-part code of the data set best: this embodies ``occam's razor.'' in two quite different experimental settings the theoretical value determined using mdl coincides with the best value found experimentally. in the first experiment the task is to recognize isolated handwritten characters in one subject's handwriting, irrespective of size and orientation. based on a new modification of elastic matching, using multiple prototypes per character, the optimal prediction rate is predicted for the learned parameter (the length of the sampling interval) considered most likely by mdl, which is shown to coincide with the best value found experimentally. in the second experiment the task is to model a robot arm with two degrees of freedom using a three layer feed-forward neural network, where we need to determine the number of nodes in the hidden layer giving the best modeling performance. the optimal model (the one that extrapolates best on unseen examples) is predicted for the number of nodes in the hidden layer considered most likely by mdl, which again is found to coincide with the best value found experimentally.",15 "slam with objects using a nonparametric pose graph. mapping and self-localization in unknown environments are fundamental capabilities in many robotic applications.
these tasks typically involve the identification of objects as unique features or landmarks, which requires the objects to be detected and then assigned a unique identifier that can be maintained when viewed from different perspectives and in different images. the \textit{data association} and \textit{simultaneous localization and mapping} (slam) problems are, individually, well-studied in the literature. but the two problems are inherently tightly coupled, and this has not been well-addressed. without accurate slam, possible data associations are combinatorial and become intractable easily. without accurate data association, the error of slam algorithms diverges easily. this paper proposes a novel nonparametric pose graph that models data association and slam in a single framework. an algorithm is further introduced to alternate between inferring data association and performing slam. experimental results show that our approach has the new capability of associating object detections and localizing objects at the same time, leading to significantly better performance on both the data association and slam problems than achieved by considering only one and ignoring the imperfections of the other.",4 "a hybrid exact-aco algorithm for the joint scheduling, power and cluster assignment in cooperative wireless networks. base station cooperation (bsc) has recently arisen as a promising way to increase the capacity of a wireless network. implementing bsc adds a new design dimension to the classical wireless network design problem: how to define the subset of base stations (clusters) that coordinate to serve a user. though the problem of forming clusters has been extensively discussed from a technical point of view, there is still a lack of effective optimization models for its representation and of algorithms for its solution. in this work, we make a further step towards filling such a gap: 1) we generalize the classical network design problem by adding cooperation as an additional decision dimension; 2) we develop a strong formulation for the resulting problem; 3) we define a new hybrid solution algorithm that combines exact large neighborhood search and ant colony optimization. finally, we assess the performance of our new model and algorithm on a set of realistic instances of a wimax network.",12 "interpretable and pedagogical examples. teachers intentionally pick the most informative examples to show their students.
however, if the teacher and student are neural networks, the examples that the teacher network learns to give, although effective at teaching the student, are typically uninterpretable. we show that training the student and teacher iteratively, rather than jointly, can produce interpretable teaching strategies. we evaluate interpretability by (1) measuring the similarity of the teacher's emergent strategies to intuitive strategies in each domain and (2) conducting human experiments to evaluate how effective the teacher's strategies are at teaching humans. we show that the teacher network learns to select or generate interpretable, pedagogical examples to teach rule-based, probabilistic, boolean, and hierarchical concepts.",4 "learning a $3$d-filtermap for deep convolutional neural networks. we present a novel and compact architecture for deep convolutional neural networks (cnns) in this paper, termed $3$d-filtermap convolutional neural networks ($3$d-fm-cnns). the convolution layer of a $3$d-fm-cnn learns a compact representation of the filters, named the $3$d-filtermap, instead of a set of independent filters as in the conventional convolution layer. the filters are extracted from the $3$d-filtermap as overlapping $3$d submatrices with weight sharing among nearby filters, and these filters are convolved with the input to generate the output of the convolution layer of the $3$d-fm-cnn. due to the weight sharing scheme, the parameter size of the $3$d-filtermap is much smaller than that of the filters to be learned in the conventional convolution layer when the $3$d-filtermap generates the same number of filters. our work is fundamentally different from the network compression literature that reduces the size of a learned large network, in the sense that a small network is directly learned from scratch. experimental results demonstrate that the $3$d-fm-cnn enjoys a small parameter space by learning compact $3$d-filtermaps, while achieving performance comparable to that of the baseline cnns which learn the same number of filters as generated by the corresponding $3$d-filtermap.",4 "3d binary signatures. in this paper, we propose a novel binary descriptor for 3d point clouds. the proposed descriptor, termed the 3d binary signature (3dbs), is motivated by the matching efficiency of binary descriptors for 2d images. 3dbs describes keypoints from point clouds with a binary vector, resulting in extremely fast matching. the method uses keypoints from standard keypoint detectors.
the descriptor is built by constructing a local reference frame and aligning a local surface patch accordingly. the local surface patch is constituted by identifying the nearest neighbours based upon an angular constraint among them. the points are ordered with respect to their distance from the keypoints. the normals of the ordered pairs of these keypoints are projected on the axes and their relative magnitude is used to assign a binary digit. the vector thus constituted is used as a signature for representing the keypoints. the matching is done using hamming distance. we show that 3dbs outperforms state of the art descriptors on various evaluation metrics.",4 "a computational content analysis of negative tweets for obesity, diet, diabetes, and exercise. social media based digital epidemiology has the potential to support faster response to and deeper understanding of public health related threats. this study proposes a new framework to analyze unstructured health related textual data via twitter users' posts (tweets) to characterize the negative health sentiments and non-health related concerns in relation to the corpus of negative sentiments regarding diet, diabetes, exercise, and obesity (ddeo). with a collection of 6 million tweets for one month, this study identified the prominent topics of users expressed as negative sentiments. the proposed framework uses two text mining methods, sentiment analysis and topic modeling, to discover negative topics. the negative sentiments of twitter users support the literature narratives and the many morbidity issues that are associated with ddeo and the linkage between obesity and diabetes. the framework offers a potential method to understand the public's opinions and sentiments regarding ddeo. more importantly, this research provides new opportunities for computational social scientists, medical experts, and public health professionals to collectively address ddeo-related issues.",4 "searching for objects using structure in indoor scenes. to identify the location of objects of a particular class, a passive computer vision system generally processes all the regions in an image to finally output few regions. however, we can use the structure in the scene to search for objects without processing the entire image. we propose a search technique that sequentially processes image regions such that the regions that are more likely to correspond to the query class object are explored earlier.
frame problem markov decision process use imitation learning algorithm learn search strategy. since structure scene essential search, work indoor scene images contain unary scene context information object-object context scene. perform experiments nyu-depth v2 dataset show unary scene context features alone achieve significantly high average precision processing 20-25\% regions classes like bed sofa. considering object-object context along scene context features, performance improved classes like counter, lamp, pillow sofa.",4 "automatic classification complexity nonfiction texts portuguese early school years. recent research shows brazilian students serious problems regarding reading skills. full development skill key academic professional future every citizen. tools classifying complexity reading materials children aim improve quality model teaching reading text comprehension. english, feng's work [11] considered state-of-the-art grade level prediction achieved 74% accuracy automatically classifying 4 levels textual complexity close school grades. classifiers nonfiction texts close grades portuguese. article, propose scheme manual annotation texts 5 grade levels, used customized reading avoid lack interest students advanced reading blocking still need make progress. obtained 52% accuracy classifying texts 5 levels 74% 3 levels. results prove promising compared state-of-the-art work.",4 "unsupervised real-to-virtual domain unification end-to-end highway driving. spectrum vision-based autonomous driving, vanilla end-to-end models not interpretable suboptimal performance, mediated perception models require additional intermediate representations segmentation masks detection bounding boxes, whose annotation prohibitively expensive move larger scale. raw images existing intermediate representations also loaded nuisance details irrelevant prediction vehicle commands, e.g. style car front view beyond road boundaries.
critically, prior works fail deal notorious domain shift merge data collected different sources, greatly hinders model generalization ability. work, address limitations taking advantage virtual data collected driving simulators, present du-drive, unsupervised real virtual domain unification framework end-to-end driving. transforms real driving data canonical representation virtual domain, vehicle control commands predicted. framework several advantages: 1) maps driving data collected different source distributions unified domain, 2) takes advantage annotated virtual data free obtain, 3) learns interpretable, canonical representation driving image specialized vehicle command prediction. extensive experiments two public highway driving datasets clearly demonstrate performance superiority interpretive capability du-drive.",4 "optimal black-box reductions optimization objectives. diverse world machine learning applications given rise plethora algorithms optimization methods, finely tuned specific regression classification task hand. reduce complexity algorithm design machine learning reductions: develop reductions take method developed one setting apply entire spectrum smoothness strong-convexity applications. furthermore, unlike existing results, new reductions optimal practical. show new reductions give rise new faster running times training linear classifiers various families loss functions, conclude experiments showing successes also practice.",12 "robust artificial neural networks outlier detection. technical report. large outliers break linear nonlinear regression models. robust regression methods allow one filter outliers building model. replacing traditional least squares criterion least trimmed squares criterion, half data treated potential outliers, one fit accurate regression models strongly contaminated data. high-breakdown methods become well established linear regression, started applied non-linear regression recently. 
work, examine problem fitting artificial neural networks contaminated data using least trimmed squares criterion. introduce penalized least trimmed squares criterion prevents unnecessary removal valid data. training anns leads challenging non-smooth global optimization problem. compare efficiency several derivative-free optimization methods solving it, show approach identifies outliers correctly anns used nonlinear regression.",12 "emojis predictable?. emojis ideograms naturally combined plain text visually complement condense meaning message. despite widely used social media, underlying semantics received little attention natural language processing standpoint. paper, investigate relation words emojis, studying novel task predicting emojis evoked text-based tweet messages. train several models based long short-term memory networks (lstms) task. experimental results show neural model outperforms two baselines well humans solving task, suggesting computational models able better capture underlying semantics emojis.",4 "multi-timescale memory dynamics reinforcement learning network attention-gated memory. learning memory intertwined brain relationship core several recent neural network models. particular, attention-gated memory tagging model (augment) reinforcement learning network emphasis biological plausibility memory dynamics learning. find augment network cannot solve hierarchical tasks, higher-level stimuli maintained long time, lower-level stimuli need remembered forgotten shorter timescale. overcome limitation, introduce hybrid augment, leaky short-timescale non-leaky long-timescale units memory, allow exchange lower-level information maintaining higher-level one, thus solving hierarchical distractor tasks.",16 "efficient attention using fixed-size memory representation. standard content-based attention mechanism typically used sequence-to-sequence models computationally expensive requires comparison large encoder decoder states time step.
work, propose alternative attention mechanism based fixed size memory representation efficient. technique predicts compact set k attention contexts encoding lets decoder compute efficient lookup need consult memory. show approach performs on-par standard attention mechanism yielding inference speedups 20% real-world translation tasks tasks longer sequences. visualizing attention scores demonstrate models learn distinct, meaningful alignments.",4 "quantifying prosodic variability middle english alliterative poetry. interest mathematical structure poetry dates back least 19th century: retiring mathematics position, j. j. sylvester wrote book prosody called $\textit{the laws verse}$. today interest computer analysis poems, paper discusses statistical approach applied task. starting definition middle english alliteration is, $\textit{sir gawain green knight}$ william langland's $\textit{piers plowman}$ used illustrate methodology. theory first developed analyzing data riemannian manifold turns applicable strings allowing one compute generalized mean variance textual data, applied poems above. ratio two variances produces analogue f test, resampling allows p-values estimated. consequently, methodology provides way compare prosodic variability two texts.",19 "symbolic sat-based algorithm almost-sure reachability small strategies pomdps. pomdps standard models probabilistic planning problems, agent interacts uncertain environment. study problem almost-sure reachability, given set target states, question decide whether policy ensure target set reached probability 1 (almost-surely). general problem exptime-complete, many practical cases policies small amount memory suffice. moreover, existing solution problem explicit, first requires construct explicitly exponential reduction belief-support mdp. work, first study existence observation-stationary strategies, np-complete, small-memory strategies. present symbolic algorithm efficient encoding sat using sat solver problem. 
report experimental results demonstrating scalability symbolic (sat-based) approach.",4 "neural discourse modeling conversations. deep neural networks shown recent promise many language-related tasks modeling conversations. extend rnn-based sequence sequence models capture long range discourse across many turns conversation. perform sensitivity analysis much additional context affects performance, provide quantitative qualitative evidence models able capture discourse relationships across multiple utterances. results quantify adding additional rnn layer modeling discourse improves quality output utterances providing previous conversation input also improves performance. searching generated outputs specific discourse markers show neural discourse models exhibit increased coherence cohesion conversations.",4 "generative models model criticism via optimized maximum mean discrepancy. propose method optimize representation distinguishability samples two probability distributions, maximizing estimated power statistical test based maximum mean discrepancy (mmd). optimized mmd applied setting unsupervised learning generative adversarial networks (gan), model attempts generate realistic samples, discriminator attempts tell apart data samples. context, mmd may used two roles: first, discriminator, either directly samples, features samples. second, mmd used evaluate performance generative model, testing model's samples reference data set. latter role, optimized mmd particularly helpful, gives interpretable indication model data distributions differ, even cases individual model samples easily distinguished either eye classifier.",19 "new image compression gradient haar wavelet. development human communications usage visual communications also increased. advancement image compression methods one main reasons enhancement. paper first presents main modes image compression methods jpeg jpeg2000 without mathematical details.
also, paper describes gradient haar wavelet transforms order construct preliminary image compression algorithm. then, new image compression method proposed based preliminary image compression algorithm improve standards image compression. new method compared original modes jpeg jpeg2000 (based haar wavelet) image quality measures mae, psnr, ssim. image quality statistical results confirm boost image compression standards. suggested new method used part image compression standard.",4 "improving neural machine translation models monolingual data. neural machine translation (nmt) obtained state-of-the-art performance several language pairs, using parallel data training. target-side monolingual data plays important role boosting fluency phrase-based statistical machine translation, investigate use monolingual data nmt. contrast previous work, combines nmt models separately trained language models, note encoder-decoder nmt architectures already capacity learn information language model, explore strategies train monolingual data without changing neural network architecture. pairing monolingual training data automatic back-translation, treat additional parallel training data, obtain substantial improvements wmt 15 task english<->german (+2.8-3.7 bleu), low-resourced iwslt 14 task turkish->english (+2.1-3.4 bleu), obtaining new state-of-the-art results. also show fine-tuning in-domain monolingual parallel data gives substantial improvements iwslt 15 task english->german.",4 using dempster-shafer scheme diagnostic expert system shell. paper discusses expert system shell integrates rule-based reasoning dempster-shafer evidence combination scheme. domain knowledge stored rules associated belief functions. reasoning component uses combination forward backward inferencing mechanisms allow interaction users mixed-initiative format.,4 "efficient computation adaptive artificial spiking neural networks.
artificial neural networks (anns) bio-inspired models neural computation proven highly effective. still, anns lack natural notion time, neural units anns exchange analog values frame-based manner, computationally energetically inefficient form communication. contrasts sharply biological neurons communicate sparingly efficiently using binary spikes. artificial spiking neural networks (snns) constructed replacing units ann spiking neurons, current performance far deep anns hard benchmarks snns use much higher firing rates compared biological counterparts, limiting efficiency. show spiking neurons employ efficient form neural coding used construct snns match high-performance anns exceed state-of-the-art snns important benchmarks, requiring much lower average firing rates. this, use spike-time coding based firing rate limiting adaptation phenomenon observed biological spiking neurons. phenomenon captured adapting spiking neuron models, derive effective transfer function. neural units anns trained transfer function substituted directly adaptive spiking neurons, resulting adaptive snns (adsnns) carry inference deep neural networks using order magnitude fewer spikes compared previous snns. adaptive spike-time coding additionally allows dynamic control neural coding precision: show simple model arousal adsnns halves average required firing rate notion naturally extends forms attention. adsnns thus hold promise novel efficient model neural computation naturally fits temporally continuous asynchronous applications.",4 "deep convolutional neural network inverse problems imaging. paper, propose novel deep convolutional neural network (cnn)-based algorithm solving ill-posed inverse problems. regularized iterative algorithms emerged standard approach ill-posed inverse problems past decades. methods produce excellent results, challenging deploy practice due factors including high computational cost forward adjoint operators difficulty hyper parameter selection. 
starting point work observation unrolled iterative methods form cnn (filtering followed point-wise non-linearity) normal operator (h*h, adjoint h times h) forward model convolution. based observation, propose using direct inversion followed cnn solve normal-convolutional inverse problems. direct inversion encapsulates physical model system, leads artifacts problem ill-posed; cnn combines multiresolution decomposition residual learning order learn remove artifacts preserving image structure. demonstrate performance proposed network sparse-view reconstruction (down 50 views) parallel beam x-ray computed tomography synthetic phantoms well real experimental sinograms. proposed network outperforms total variation-regularized iterative reconstruction realistic phantoms requires less second reconstruct 512 x 512 image gpu.",4 "convolutional imputation matrix networks. matrix network family matrices, relationship modeled weighted graph. node represents matrix, weight edge represents similarity two matrices. suppose observe entries matrix noise, fraction entries observe varies matrix matrix. even worse, subset matrices family may completely unobserved. recover entire matrix network noisy incomplete observations? one motivating example cold start problem, need inference new users items come information. recover network matrices, propose structural assumption matrix network approximated generalized convolution low rank matrices living network. propose iterative imputation algorithm complete matrix network. algorithm efficient large scale applications guaranteed accurately recover matrices, long enough observations accumulated network.",4 "t-skirt: online estimation student proficiency adaptive learning system. develop t-skirt: temporal, structured-knowledge, irt-based method predicting student responses online. 
explicitly accounting student learning employing structured, multidimensional representation student proficiencies, model outperforms standard irt-based methods online response prediction task applied real responses collected students interacting diverse pools educational content.",4 "proportional conflict redistribution rules information fusion. paper propose five versions proportional conflict redistribution rule (pcr) information fusion together several examples. pcr1 pcr2, pcr3, pcr4, pcr5 one increases complexity rules also exactitude redistribution conflicting masses. pcr1 restricted hyper-power set power set without degenerate cases gives result weighted average operator (wao) proposed recently j{\o}sang, daniel vannoorenberghe satisfy neutrality property vacuous belief assignment. that's improved pcr rules proposed paper. pcr4 improvement minc dempster's rules. pcr rules redistribute conflicting mass, conjunctive rule applied, proportionally functions depending masses assigned corresponding columns mass matrix. infinitely many ways functions (weighting factors) chosen depending complexity one wants deal specific applications fusion systems. fusion combination rule degree ad-hoc.",4 "robustness causal claims. causal claim assertion invokes causal relationships variables, example drug certain effect preventing disease. causal claims established combination data set causal assumptions called causal model. claim robust insensitive violations causal assumptions embodied model. paper gives formal definition notion robustness establishes graphical condition quantifying degree robustness given causal claim. algorithms computing degree robustness also presented.",4 "bitwise operations cellular automaton gray-scale images. cellular automata (ca) theory discrete model represents state cells finite set possible values evolve time according pre-defined set transition rules. ca applied number image processing tasks convex hull detection, image denoising etc. 
mostly limitation restricting input binary images. general, gray-scale image may converted number different binary images finally recombined ca operations individually. developed multinomial regression based weighted summation method recombine binary images better performance ca based image processing algorithms. recombination algorithm tested specific case denoising salt pepper noise test standard benchmark algorithms median filter various images noise levels. results indicate several interesting invariances application ca, particular noise realization choice sub-sampling pixels determine recombination weights. additionally, appears simpler algorithms weight optimization seek local minima work effectively seek global minima simulated annealing.",4 "fully scalable online-preprocessing algorithm short oligonucleotide microarray atlases. accumulation standardized data collections opening novel opportunities holistic characterization genome function. limited scalability current preprocessing techniques has, however, formed bottleneck full utilization contemporary microarray collections. short oligonucleotide arrays constitute major source genome-wide profiling data, scalable probe-level preprocessing algorithms available measurement platforms based pre-calculated model parameters restricted reference training sets. overcome key limitations, introduce fully scalable online-learning algorithm provides tools process large microarray atlases including tens thousands arrays. unlike alternatives, proposed algorithm scales linear time respect sample size readily applicable short oligonucleotide platforms. available preprocessing algorithm learn probe-level parameters based sequential hyperparameter updates small, consecutive batches data, thus circumventing extensive memory requirements standard approaches opening novel opportunities take full advantage contemporary microarray data collections.
moreover, using comprehensive data collections estimate probe-level effects assist pinpointing individual probes affected various biases provide new tools guide array design quality control. implementation freely available r/bioconductor http://www.bioconductor.org/packages/devel/bioc/html/rpa.html",16 "watch-bot: unsupervised learning reminding humans forgotten actions. present robotic system watches human using kinect v2 rgb-d sensor, detects forgot performing activity, necessary reminds person using laser pointer point related object. simple setup easily deployed assistive robot. approach based learning algorithm trained purely unsupervised setting, require human annotations. makes approach scalable applicable variant scenarios. model learns action/object co-occurrence action temporal relations activity, uses learned rich relationships infer forgotten action related object. show approach improves unsupervised action segmentation action cluster assignment performance, also effectively detects forgotten actions challenging human activity rgb-d video dataset. robotic experiments, show robot able remind people forgotten actions successfully.",4 "generalized least squares matrix decomposition. variables many massive high-dimensional data sets structured, arising example measurements regular grid imaging time series spatial-temporal measurements climate studies. classical multivariate techniques ignore structural relationships often resulting poor performance. propose generalization singular value decomposition (svd) principal components analysis (pca) appropriate massive data sets structured variables known two-way dependencies. finding best low rank approximation data respect transposable quadratic norm, decomposition, entitled generalized least squares matrix decomposition (gmd), directly accounts structural relationships. 
many variables high-dimensional settings often irrelevant noisy, also regularize matrix decomposition adding two-way penalties encourage sparsity smoothness. develop fast computational algorithms using methods perform generalized pca (gpca), sparse gpca, functional gpca massive data sets. simulations whole brain functional mri example demonstrate utility methodology dimension reduction, signal recovery, feature selection high-dimensional structured data.",19 "single-channel multi-talker speech recognition permutation invariant training. although great progresses made automatic speech recognition (asr), significant performance degradation still observed recognizing multi-talker mixed speech. paper, propose evaluate several architectures address problem assumption single channel mixed signal available. technique extends permutation invariant training (pit) introducing front-end feature separation module minimum mean square error (mse) criterion back-end recognition module minimum cross entropy (ce) criterion. specifically, training compute average mse ce whole utterance possible utterance-level output-target assignment, pick one minimum mse ce, optimize assignment. strategy elegantly solves label permutation problem observed deep learning based multi-talker mixed speech separation recognition systems. proposed architectures evaluated compared artificially mixed ami dataset two- three-talker mixed speech. experimental results indicate proposed architectures cut word error rate (wer) 45.0% 25.0% relatively state-of-the-art single-talker speech recognition system across speakers energies comparable, two- three-talker mixed speech, respectively. knowledge, first work multi-talker mixed speech recognition challenging speaker-independent spontaneous large vocabulary continuous speech task.",4 "loopy belief propagation approximate inference: empirical study. 
recently, researchers demonstrated loopy belief propagation - use pearl's polytree algorithm bayesian network loops - error-correcting codes. the dramatic instance near shannon-limit performance turbo codes - codes whose decoding algorithm equivalent loopy belief propagation chain-structured bayesian network. paper ask: something special error-correcting code context, loopy propagation work approximate inference scheme in general setting? compare marginals computed using loopy propagation exact ones four bayesian network architectures, including two real-world networks: alarm qmr. we find loopy beliefs often converge do, give good approximation correct marginals. however, on qmr network, loopy beliefs oscillated obvious relationship correct posteriors. present initial investigations cause oscillations, show simple methods preventing lead wrong results.",4 "driver action prediction using deep (bidirectional) recurrent neural network. advanced driver assistance systems (adas) significantly improved effective driver action prediction (dap). predicting driver actions early accurately help mitigate effects potentially unsafe driving behaviors avoid possible accidents. paper, formulate driver action prediction timeseries anomaly prediction problem. anomaly (driver actions interest) detection might trivial context, finding patterns consistently precede anomaly requires searching extracting features across multi-modal sensory inputs. present driver action prediction system, including real-time data acquisition, processing learning framework predicting future impending driver action. proposed system incorporates camera-based knowledge driving environment driver themselves, addition traditional vehicle dynamics. uses deep bidirectional recurrent neural network (dbrnn) learn correlation sensory inputs impending driver behavior achieving accurate high horizon action prediction.
proposed system performs better existing systems driver action prediction tasks accurately predict key driver actions including acceleration, braking, lane change turning durations 5sec action executed driver.",19 "sentence entailment compositional distributional semantics. distributional semantic models provide vector representations words gathering co-occurrence frequencies corpora text. compositional distributional models extend representations words phrases sentences. categorical compositional distributional semantics representations built manner meanings phrases sentences functions grammatical structure meanings words therein. models applied reasoning phrase sentence level similarity. paper, argue prove models also used reason phrase sentence level entailment. provide preliminary experimental results toy entailment dataset.",4 "visual storytelling. introduce first dataset sequential vision-to-language, explore data may used task visual storytelling. first release dataset, sind v.1, includes 81,743 unique photos 20,211 sequences, aligned descriptive (caption) story language. establish several strong baselines storytelling task, motivate automatic metric benchmark progress. modelling concrete description well figurative social language, provided dataset storytelling task, potential move artificial intelligence basic understandings typical visual scenes towards human-like understanding grounded event structure subjective expression.",4 soft scheduling. classical notions disjunctive cumulative scheduling studied point view soft constraint satisfaction. soft disjunctive scheduling introduced instance soft csp preferences included problem applied generate lower bound based existing discrete capacity resource. timetabling problems purdue university faculty informatics masaryk university considering individual course requirements students demonstrate practical problems solved via proposed methods. 
implementation general preference constraint solver discussed first computational results timetabling problem presented.,4 "imitation networks: few-shot learning neural networks scratch. paper, propose imitation networks, simple effective method training neural networks limited amount training data. approach inherits idea knowledge distillation transfers knowledge deep wide reference model shallow narrow target model. proposed method employs idea mimic predictions reference estimators much robust overfitting network want train. different almost previous work knowledge distillation requires large amount labeled training data, proposed method requires small amount training data. instead, introduce pseudo training examples optimized part model parameters. experimental results several benchmark datasets demonstrate proposed method outperformed baselines, naive training target model standard knowledge distillation.",19 "adversarial examples fool detectors. adversarial example example adjusted produce wrong label presented system test time. date, adversarial example constructions demonstrated classifiers, detectors. adversarial examples could fool detector exist, could used (for example) maliciously create security hazards roads populated smart vehicles. paper, demonstrate construction successfully fools two standard detectors, faster rcnn yolo. existence examples surprising, attacking classifier different attacking detector, structure detectors - must search bounding box, cannot estimate box accurately - makes quite likely adversarial patterns strongly disrupted. show construction produces adversarial examples generalize well across sequences digitally, even though large perturbations needed. also show construction yields physical objects adversarial.",4 "model-based outdoor performance capture. propose new model-based method accurately reconstruct human performances captured outdoors multi-camera setup. 
starting template actor model, introduce new unified implicit representation both, articulated skeleton tracking nonrigid surface shape refinement. method fits template unsegmented video frames two stages - first, coarse skeletal pose estimated, subsequently non-rigid surface shape body pose jointly refined. particularly surface shape refinement propose new combination 3d gaussians designed align projected model likely silhouette contours without explicit segmentation edge detection. obtain reconstructions much higher quality outdoor settings existing methods, show par state-of-the-art methods indoor scenes designed",4 "bio-inspired collision detector small quadcopter. sense avoid capability enables insects fly versatilely robustly dynamic complex environment. biological principles practical efficient inspired human imitating flying machines. paper, studied novel bio-inspired collision detector application quadcopter. detector inspired lgmd neurons locusts, modeled stm32f407 mcu. compared collision detecting methods applied quadcopters, focused enhancing collision selectivity bio-inspired way considerably increase computing efficiency obstacle detecting task even complex dynamic environment. designed quadcopter's responding operation imminent collisions tested bio-inspired system indoor arena. observed results experiments demonstrated lgmd collision detector feasible work vision module quadcopter's collision avoidance task.",4 "design intelligent layer flexible querying databases. computer-based information technologies extensively used help many organizations, private companies, academic education institutions manage processes information systems hereby become nervous centre. explosion massive data sets created businesses, science governments necessitates intelligent powerful computing paradigms users benefit data. therefore new-generation database applications demand intelligent information management enhance efficient interactions database users.
database systems support boolean query model. selection query sql database returns tuples satisfy conditions query.",4 "trajectory-based radical analysis network online handwritten chinese character recognition. recently, great progress made online handwritten chinese character recognition due emergence deep learning techniques. however, previous research mostly treated chinese character one class without explicitly considering inherent structure, namely radical components complicated geometry. study, propose novel trajectory-based radical analysis network (tran) firstly identify radicals analyze two-dimensional structures among radicals simultaneously, recognize chinese characters generating captions based analysis internal radicals. proposed tran employs recurrent neural networks (rnns) encoder decoder. rnn encoder makes full use online information directly transforming handwriting trajectory high-level features. rnn decoder aims generating caption detecting radicals spatial structures attention model. manner treating chinese character two-dimensional composition radicals reduce size vocabulary enable tran possess capability recognizing unseen chinese character classes, corresponding radicals seen. evaluated casia-olhwdb database, proposed approach significantly outperforms state-of-the-art whole-character modeling approach relative character error rate (cer) reduction 10%. meanwhile, case recognition 500 unseen chinese characters, tran achieve character accuracy 60% traditional whole-character method capability handle them.",4 "vinet: visual-inertial odometry sequence-to-sequence learning problem. paper present on-manifold sequence-to-sequence learning approach motion estimation using visual inertial sensors. best knowledge first end-to-end trainable method visual-inertial odometry performs fusion data intermediate feature-representation level. method numerous advantages traditional approaches. 
specifically, eliminates need tedious manual synchronization camera imu well eliminating need manual calibration imu camera. advantage model naturally elegantly incorporates domain specific information significantly mitigates drift. show approach competitive state-of-the-art traditional methods accurate calibration data available trained outperform presence calibration synchronization errors.",4 "electricity demand energy consumption management system. project describes electricity demand energy consumption management system application southern peru smelter. composed hourly demand-forecasting module simulation component plant electrical system. first module done using dynamic neural networks backpropagation training algorithm; used predict electric power demanded every hour, error percentage 1%. information allows efficient management energy peak demands happen, distributing raise electric load hours improving equipment increase demand. simulation module based advanced estimation techniques, as: parametric estimation, neural network modeling, statistical regression previously developed models, simulates electric behavior smelter plant. modules facilitate electricity demand consumption proper planning, allow knowing behavior hourly demand consumption patterns plant, including bill components, also energy deficiencies opportunities improvement, based analysis information equipment, processes production plans, well maintenance programs. finally results application southern peru smelter presented.",4 "optimal hybrid channel allocation: based machine learning algorithms. recent advances cellular communication systems resulted huge increase spectrum demand. meet requirements ever-growing need spectrum, efficient utilization existing resources utmost importance. channel allocation, thus become inevitable research topic wireless communications. paper, propose optimal channel allocation scheme, optimal hybrid channel allocation (ohca) effective allocation channels.
improve upon existing fixed channel allocation (fca) technique imparting intelligence existing system employing multilayer perceptron technique.",4 "compact, hierarchical q-function decomposition. previous work hierarchical reinforcement learning faced dilemma: either ignore values different possible exit states subroutine, thereby risking suboptimal behavior, represent values explicitly thereby incurring possibly large representation cost exit values refer nonlocal aspects world (i.e., subsequent rewards). paper shows that, many cases, one avoid problems. solution based recursively decomposing exit value function terms q-functions higher levels hierarchy. leads intuitively appealing runtime architecture parent subroutine passes child value function exit states child reasons choices affect exit value. also identify structural conditions value function transition distributions allow much concise representations exit state distributions, leading state abstraction. essence, variables whose exit values need considered parent cares child affects. demonstrate utility algorithms series increasingly complex environments.",4 "learning bayesian networks bnlearn r package. bnlearn r package includes several algorithms learning structure bayesian networks either discrete continuous variables. constraint-based score-based algorithms implemented, use functionality provided snow package improve performance via parallel computing. several network scores conditional independence tests available learning algorithms independent use. advanced plotting options provided rgraphviz package.",19 "geometric decomposition feed forward neural networks. several attempts mathematically understand neural networks many biological computational perspectives. field exploded last decade, yet neural networks still treated much like black box. work describe structure inherent feed forward neural network.
provide framework future work neural networks improve training algorithms, compute homology network, applications. approach takes geometric point view unlike attempts mathematically understand neural networks rely functional perspective.",4 "combining semantic wikis controlled natural language. demonstrate acewiki semantic wiki using controlled natural language attempto controlled english (ace). goal enable easy creation modification ontologies web. texts ace automatically translated first-order logic languages, example owl. previous evaluation showed ordinary people able use acewiki without instructed.",4 "biomedical question answering via weighted neural network passage retrieval. amount publicly available biomedical literature growing rapidly recent years, yet question answering systems still struggle exploit full potential source data. preliminary processing step, many question answering systems rely retrieval models identifying relevant documents passages. paper proposes weighted cosine distance retrieval scheme based neural network word embeddings. experiments based publicly available data tasks bioasq biomedical question answering challenge demonstrate significant performance gains wide range state-of-the-art models.",4 "proceedings eighteenth conference uncertainty artificial intelligence (2002). proceedings eighteenth conference uncertainty artificial intelligence, held alberta, canada, august 1-4 2002",4 "evolutionary multiobjective optimization multi-location transshipment problem. consider multi-location inventory system inventory choices location centrally coordinated. lateral transshipments allowed recourse actions within echelon inventory system reduce costs improve service level. however, transshipment process usually causes undesirable lead times. 
paper, propose multiobjective model multi-location transshipment problem addresses optimizing three conflicting objectives: (1) minimizing aggregate expected cost, (2) maximizing expected fill rate, (3) minimizing expected transshipment lead times. apply evolutionary multiobjective optimization approach using strength pareto evolutionary algorithm (spea2), approximate optimal pareto front. simulation wide choice model parameters shows different trade-offs conflicting objectives.",4 "dopamine modulation prefrontal delay activity-reverberatory activity sharpness tuning curves. recent electrophysiological experiments shown dopamine (d1) modulation pyramidal cells prefrontal cortex reduces spike frequency adaptation enhances nmda transmission. using four models, multicompartmental integrate fire, examine effects modulations sustained (delay) activity reverberatory network. find d1 modulation may enable robust network bistability yielding selective reverberation among cells code particular item location. show tuning curve cells sharpened, signal-to-noise ratio increased. postulate d1 modulation affects tuning ""memory fields"" yield efficient distributed dynamic representations.",16 "knowledge representation web revisited: tools prototype based ontologies. recent years rdf owl become common knowledge representation languages use web, propelled recommendation w3c. paper present practical implementation different kind knowledge representation based prototypes. detail, present concrete syntax easily effectively parsable applications. also present extensible implementations prototype knowledge base, specifically designed storage prototypes. implementations written java extended using implementation library. alternatively, software deployed such. further, results benchmarks local web deployment presented. paper augments research paper, describe theoretical aspects prototype system.",4 "real-time monocular object slam.
present real-time object-based slam system leverages largest object database date. approach comprises two main components: 1) monocular slam algorithm exploits object rigidity constraints improve map find real scale, 2) novel object recognition algorithm based bags binary words, provides live detections database 500 3d objects. two components work together benefit other: slam algorithm accumulates information observations objects, anchors object features special map landmarks sets constraints optimization. time, objects partially fully located within map used prior guide recognition algorithm, achieving higher recall. evaluate proposal five real environments showing improvements accuracy map efficiency respect state-of-the-art techniques.",4 "multiplicative drift analysis. work, introduce multiplicative drift analysis suitable way analyze runtime randomized search heuristics evolutionary algorithms. give multiplicative version classical drift theorem. allows easier analyses settings optimization progress roughly proportional current distance optimum. display strength tool, regard classical problem (1+1) evolutionary algorithm optimizes arbitrary linear pseudo-boolean function. here, first give relatively simple proof fact linear function optimized expected time $o(n \log n)$, $n$ length bit string. afterwards, show fact function optimized expected time ${(1+o(1)) 1.39 e n\ln(n)}$, using multiplicative drift analysis. also prove corresponding lower bound ${(1-o(1))e n\ln(n)}$ actually holds functions unique global optimum. demonstrate drift theorem immediately gives natural proofs (with better constants) best known runtime bounds (1+1) evolutionary algorithm combinatorial problems like finding minimum spanning trees, shortest paths, euler tours.",4 "large margin filtering kernel-based sequential signal labeling. address paper problem multi-channel signal sequence labeling.
particular, consider problem signals contaminated noise may present dephasing respect labels. that, propose jointly learn svm sample classifier temporal filtering channels. lead large margin filtering adapted specificity channel (noise time-lag). derive algorithms solve optimization problem discuss different filter regularizations automated scaling selection channels. approach tested non-linear toy example bci dataset. results show classification performance problems improved learning large margin filtering.",4 "improve evaluation fluency using entropy machine translation evaluation metrics. widely-used automatic evaluation metrics cannot adequately reflect fluency translations. n-gram-based metrics, like bleu, limit maximum length matched fragments n cannot catch matched fragments longer n, reflect fluency indirectly. meteor, not limited n-gram, uses number matched chunks without considering length chunk. paper, propose entropy-based method, sufficiently reflect fluency translations distribution matched words. method easily combine widely-used automatic evaluation metrics improve evaluation fluency. experiments show correlations bleu meteor improved sentence level combining entropy-based method wmt 2010 wmt 2012.",4 "design, implementation simulation cloud computing system enhancing real-time video services using vanet onboard navigation systems. paper, propose design novel experimental cloud computing systems. proposed system aims enhancing computational, communicational analytic capabilities road navigation services merging several independent technologies, namely vision-based embedded navigation systems, prominent cloud computing systems (ccss) vehicular ad-hoc network (vanet). work presents initial investigations describing design global generic system. designed system experimented various scenarios video-based road services. moreover, associated architecture implemented small-scale simulator in-vehicle embedded system.
implemented architecture experimented case simulated road service aid police agency. goal service recognize track searched individuals vehicles real-time monitoring system remotely connected moving cars. presented work demonstrates potential system efficiently enhancing diversifying real-time video services road environments.",4 "fast parallel svm using data augmentation. one popular classifiers, linear svms still challenges dealing large-scale problems, even though linear sub-linear algorithms developed recently single machines. parallel computing methods developed learning large-scale svms. however, existing methods rely solving local sub-optimization problems. paper, develop novel parallel algorithm learning large-scale linear svm. approach based data augmentation equivalent formulation, casts problem learning svm bayesian inference problem, develop efficient parallel sampling methods. provide empirical results parallel sampling svm, provide extensions svr, non-linear kernels, provide parallel implementation crammer singer model. approach promising right, useful technique parallelize broader family general maximum-margin models.",4 "3d textured model encryption via 3d lu chaotic mapping. coming virtual/augmented reality (vr/ar) era, 3d contents popularized images videos today. security privacy 3d contents taken consideration. 3d contents contain surface models solid models. surface models include point clouds, meshes textured models. previous work mainly focus encryption solid models, point clouds meshes. work focuses complicated 3d textured model. propose 3d lu chaotic mapping based encryption method 3d textured model. encrypt vertexes, polygons textures 3d models separately using 3d lu chaotic mapping. encrypted vertices, edges texture maps composited together form final encrypted 3d textured model. experimental results reveal method encrypt decrypt 3d textured models correctly. 
addition, method resistant several attacks brute-force attack statistical attack.",4 "trolling hierarchy social media conditional random field trolling detection. an ever-increasing number social media websites, electronic newspapers internet forums allow visitors leave comments others read interact. exchange not free participants malicious intentions, contribute written conversation. among different communities users adopt strategies handle users. paper present comprehensive categorization trolling phenomena resource, inspired politeness research propose model jointly predicts four crucial aspects trolling: intention, interpretation, intention disclosure response strategy. finally, present new annotated dataset containing excerpts conversations involving trolls interactions users hope useful resource research community.",4 "heads better one: training diverse ensemble deep networks. convolutional neural networks achieved state-of-the-art performance wide range tasks. benchmarks led ensembles powerful learners, ensembling typically treated post-hoc procedure implemented averaging independently trained models model variation induced bagging random initialization. paper, rigorously treat ensembling first-class problem explicitly address question: best strategies create ensemble? first compare large number ensembling strategies, propose evaluate novel strategies, parameter sharing (through new family models call treenets) well training ensemble-aware diversity-encouraging losses. demonstrate treenets improve ensemble performance diverse ensembles trained end-to-end unified loss, achieving significantly higher ""oracle"" accuracies classical ensembles.",4 "challenge ieee-isbi/tcb: application covariance matrices wavelet marginals. short memo aims explaining approach challenge ieee-isbi bone texture characterization. work, focus use covariance matrices wavelet marginals svm classifier.",4 spatiotemporal articulated models dynamic slam.
propose online spatiotemporal articulation model estimation framework estimates articulated structure well temporal prediction model solely using passive observations. resulting model predict future motions articulated object high confidence spatial temporal structure. demonstrate effectiveness predictive model incorporating within standard simultaneous localization mapping (slam) pipeline mapping robot localization previously unexplored dynamic environments. method able localize robot map dynamic scene explaining observed motion world. demonstrate effectiveness proposed framework simulated real-world dynamic environments.,4 "computer simulation based parameter selection resistance exercise. contrast scientific disciplines, sports science research characterized comparatively little effort investment development relevant phenomenological models. scarcer yet application said models practice. present framework allows resistance training practitioners employ recently proposed neuromuscular model actual training program design. first novelty concerns monitoring aspect coaching. method extracting training performance characteristics loosely constrained video sequences, effortlessly minimal human input, using computer vision described. extracted data subsequently used fit underlying neuromuscular model. achieved solving inverse dynamics problem corresponding particular exercise. lastly, computer simulation hypothetical training bouts, using athlete-specific capability parameters, used predict effected adaptation changes performance. software described allows practitioner manipulate hypothetical training parameters immediately see effect predicted adaptation specific athlete. thus, work presents holistic view monitoring-assessment-adjustment loop.",4 "fast convergence regularized learning games.
show natural classes regularized learning algorithms form recency bias achieve faster convergence rates approximate efficiency coarse correlated equilibria multiplayer normal form games. player game uses algorithm class, individual regret decays $o(t^{-3/4})$, sum utilities converges approximate optimum $o(t^{-1})$--an improvement upon worst case $o(t^{-1/2})$ rates. show black-box reduction algorithm class achieve $\tilde{o}(t^{-1/2})$ rates adversary, maintaining faster rates algorithms class. results extend [rakhlin sridharan 2013] [daskalakis et al. 2014], analyzed two-player zero-sum games specific algorithms.",4 "fusing video inertial sensor data walking person identification. autonomous computer system (such robot) typically needs identify, locate, track persons appearing sight. however, solutions limitations regarding efficiency, practicability, environmental constraints. paper, propose effective practical system combines video inertial sensors person identification (pid). persons different activities easy identify. show robustness potential system, propose walking person identification (wpid) method identify persons walking time. comparing features derived video inertial sensor data, associate sensors smartphones human objects videos. results show correctly identified rate wpid method 76% 2 seconds.",4 "machine learning based data mining milky way filamentary structures reconstruction. present innovative method called filexsec (filaments extraction, selection classification), data mining tool developed investigate possibility refine optimize shape reconstruction filamentary structures detected consolidated method based flux derivative analysis, column-density maps computed herschel infrared galactic plane survey (hi-gal) observations galactic plane. present methodology based feature extraction module followed machine learning model (random forest) dedicated select features classify pixels input images.
tests simulations real observations method appears reliable robust respect variability shape distribution filaments. cases highly defined filament structures, presented method able bridge gaps among detected fragments, thus improving shape reconstruction. preliminary ""a posteriori"" analysis derived filament physical parameters, method appears potentially able add sufficient contribution complete refine filament reconstruction.",1 "two-point method ptz camera calibration sports. calibrating narrow field view soccer cameras challenging field markings image. unlike previous solutions, propose two-point method, requires two point correspondences given prior knowledge base location orientation pan-tilt-zoom (ptz) camera. deploy new calibration method annotate pan-tilt-zoom data soccer videos. collected data used references new images. also propose fast random forest method predict pan-tilt angles without image-to-image feature matching, leading efficient calibration method new images. demonstrate system synthetic data two real soccer datasets. two-point approach achieves superior performance state-of-the-art method.",4 "knowledge mining model ranking institutions using rough computing ordering rules formal concept analysis. emergence computers information technological revolution made tremendous changes real world provides different dimension intelligent data analysis. well formed fact, information right time right place deploy better knowledge. however, challenge arises larger volume inconsistent data given decision making knowledge extraction. handle imprecise data certain mathematical tools greater importance developed researchers recent past namely fuzzy set, intuitionistic fuzzy set, rough set, formal concept analysis ordering rules. also observed many information system contains numerical attribute values therefore almost similar instead exact similar. handle type information system, paper use two processes pre process post process.
pre process use rough set intuitionistic fuzzy approximation space ordering rules finding knowledge whereas post process use formal concept analysis explore better knowledge vital factors affecting decisions.",4 "fsmj: feature selection maximum jensen-shannon divergence text categorization. paper, present new wrapper feature selection approach based jensen-shannon (js) divergence, termed feature selection maximum js-divergence (fsmj), text categorization. unlike existing feature selection approaches, proposed fsmj approach based real-valued features provide information discrimination binary-valued features used conventional approaches. show fsmj greedy approach js-divergence monotonically increases features selected. conduct several experiments real-life data sets, compared state-of-the-art feature selection approaches text categorization. superior performance proposed fsmj approach demonstrates effectiveness indicates wide potential applications data mining.",19 "real-time scheduling via reinforcement learning. cyber-physical systems, mobile robots, must respond adaptively dynamic operating conditions. effective operation systems requires sensing actuation tasks performed timely manner. additionally, execution mission specific tasks imaging room must balanced need perform general tasks obstacle avoidance. problem addressed maintaining relative utilization shared resources among tasks near user-specified target level. producing optimal scheduling strategies requires complete prior knowledge task behavior, unlikely available practice. instead, suitable scheduling strategies must learned online interaction system. consider sample complexity reinforcement learning domain, demonstrate problem state space countably infinite, may leverage problem's structure guarantee efficient learning.",4 "learning sight sound: ambient sound provides supervision visual learning. sound crashing waves, roar fast-moving cars -- sound conveys important information objects surroundings. 
work, show ambient sounds used supervisory signal learning visual models. demonstrate this, train convolutional neural network predict statistical summary sound associated video frame. show that, process, network learns representation conveys information objects scenes. evaluate representation several recognition tasks, finding performance comparable state-of-the-art unsupervised learning methods. finally, show visualizations network learns units selective objects often associated characteristic sounds. paper extends earlier conference paper, owens et al. 2016, additional experiments discussion.",4 "deep structured features semantic segmentation. propose highly structured neural network architecture semantic segmentation extremely small model size, suitable low-power embedded mobile platforms. specifically, architecture combines i) haar wavelet-based tree-like convolutional neural network (cnn), ii) random layer realizing radial basis function kernel approximation, iii) linear classifier. stages i) ii) completely pre-specified, linear classifier learned data. apply proposed architecture outdoor scene aerial image semantic segmentation show accuracy architecture competitive conventional pixel classification cnns. furthermore, demonstrate proposed architecture data efficient sense matching accuracy pixel classification cnns trained much smaller data set.",4 "mobile big data analytics using deep learning apache spark. proliferation mobile devices, smartphones internet things (iot) gadgets, results recent mobile big data (mbd) era. collecting mbd unprofitable unless suitable analytics learning methods utilized extracting meaningful information hidden patterns data. article presents overview brief tutorial deep learning mbd analytics discusses scalable learning framework apache spark. specifically, distributed deep learning executed iterative mapreduce computing many spark workers. 
spark worker learns partial deep model partition overall mbd, master deep model built averaging parameters partial models. spark-based framework speeds learning deep models consisting many hidden layers millions parameters. use context-aware activity recognition application real-world dataset containing millions samples validate framework assess speedup effectiveness.",4 "improving accuracy cognilearn system cognitive behavior assessment. htks game-like cognitive assessment method, designed children four eight years age. htks assessment, child responds sequence requests, ""touch head"" ""touch toes"". cognitive challenge stems fact children instructed interpret requests literally, touching different body part one stated. prior work, developed cognilearn system, captures data subjects performing htks game, analyzes motion subjects. paper propose specific improvements make motion analysis module accurate. result improvements, accuracy recognizing cases subjects touch toes gone 76.46% previous work 97.19% paper.",4 "partial reinitialisation optimisers. heuristic optimisers search optimal configuration variables relative objective function often get stuck local optima algorithm unable find improvement. standard approach circumvent problem involves periodically restarting algorithm random initial configurations improvement found. propose method partial reinitialization, whereby, attempt find better solution, sub-sets variables re-initialised rather whole configuration. much information gained previous runs hence retained. leads significant improvements quality solution found given time variety optimisation problems machine learning.",19 "sentibubbles: topic modeling sentiment visualization entity-centric tweets. social media users tend mention entities reacting news events. main purpose work create entity-centric aggregations tweets daily basis. 
applying topic modeling sentiment analysis, create data visualization insights current events people reactions events entity-centric perspective.",4 "learning independent features adversarial nets non-linear ica. reliable measures statistical dependence could useful tools learning independent features performing tasks like source separation using independent component analysis (ica). unfortunately, many measures, like mutual information, hard estimate optimize directly. propose learn independent features adversarial objectives optimize measures implicitly. objectives compare samples joint distribution product marginals without need compute probability densities. also propose two methods obtaining samples product marginals using either simple resampling trick separate parametric distribution. experiments show strategy easily applied different types model architectures solve linear non-linear ica problems.",19 "multi-modal aggregation video classification. paper, present solution large-scale video classification challenge (lsvc2017) [1] ranked 1st place. focused variety modalities cover visual, motion audio. also, visualized aggregation process better understand modality takes effect. among extracted modalities, found temporal-spatial features calculated 3d convolution quite promising greatly improved performance. attained official metric map 0.8741 testing set ensemble model.",4 "learning aggregated transmission propagation networks haze removal beyond. single image dehazing important low-level vision task many applications. early researches investigated different kinds visual priors address problem. however, may fail assumptions valid specific images. recent deep networks also achieve relatively good performance task. unfortunately, due disappreciation rich physical rules hazes, large amounts data required training. importantly, may still fail exist completely different haze distributions testing images. 
considering collaborations two perspectives, paper designs novel residual architecture aggregate prior (i.e., domain knowledge) data (i.e., haze distribution) information propagate transmissions scene radiance estimation. present variational energy based perspective investigate intrinsic propagation behavior aggregated deep model. way, actually bridge gap prior driven models data driven networks leverage advantages avoid limitations previous dehazing approaches. lightweight learning framework proposed train propagation network. finally, introducing task-aware image decomposition formulation flexible optimization scheme, extend proposed model challenging vision tasks, underwater image enhancement single image rain removal. experiments synthetic real-world images demonstrate effectiveness efficiency proposed framework.",4 "wavelet-based semantic features hyperspectral signature discrimination. hyperspectral signature classification quantitative analysis approach hyperspectral imagery performs detection classification constituent materials pixel level scene. classification procedure operated directly hyperspectral data performed using features extracted corresponding hyperspectral signatures containing information like signature's energy shape. paper, describe technique applies non-homogeneous hidden markov chain (nhmc) models hyperspectral signature classification. basic idea use statistical models (such nhmc) characterize wavelet coefficients capture spectrum semantics (i.e., structural information) multiple levels. experimental results show approach based nhmc models outperform existing approaches relevant classification tasks.",4 "diverse accurate image description using variational auto-encoder additive gaussian encoding space. paper explores image caption generation using conditional variational auto-encoders (cvaes). standard cvaes fixed gaussian prior yield descriptions little variability.
instead, propose two models explicitly structure latent space around $k$ components corresponding different types image content, combine components create priors images contain multiple types content simultaneously (e.g., several kinds objects). first model uses gaussian mixture model (gmm) prior, second one defines novel additive gaussian (ag) prior linearly combines component means. show models produce captions diverse accurate strong lstm baseline ""vanilla"" cvae fixed gaussian prior, ag-cvae showing particular promise.",4 "hyperbolic representation learning fast efficient neural question answering. dominant neural architectures question answer retrieval based recurrent convolutional encoders configured complex word matching layers. given recent architectural innovations mostly new word interaction layers attention-based matching mechanisms, seems well-established fact components mandatory good performance. unfortunately, memory computation cost incurred complex mechanisms undesirable practical applications. such, paper tackles question whether possible achieve competitive performance simple neural architectures. propose simple novel deep learning architecture fast efficient question-answer ranking retrieval. specifically, proposed model, \textsc{hyperqa}, parameter efficient neural network outperforms parameter intensive models attentive pooling bilstms multi-perspective cnns multiple qa benchmarks. novelty behind \textsc{hyperqa} pairwise ranking objective models relationship question answer embeddings hyperbolic space instead euclidean space. empowers model self-organizing ability enables automatic discovery latent hierarchies learning embeddings questions answers. model requires feature engineering, similarity matrix matching, complicated attention mechanisms over-parameterized layers yet outperforms remains competitive many models functionalities multiple benchmarks.",4 "framework culture-aware robots based fuzzy logic. 
cultural adaptation, i.e., matching a robot's behaviours to the cultural norms and preferences of its user, is a well known key requirement for the success of any assistive application. however, culture-dependent robot behaviours are often implicitly set by designers, thus not allowing for easy and automatic adaptation to different cultures. this paper presents a method for the design of culture-aware robots that can automatically adapt their behaviour to conform to a given culture. we propose a mapping from cultural factors to the related parameters of robot behaviours which relies on linguistic variables to encode heterogeneous cultural factors in a uniform formalism, and on fuzzy rules to encode qualitative relations among multiple variables. we illustrate the approach in two practical case studies.",4 "image resolution enhancement using interpolation followed by iterative back projection. in this paper, we propose a new super resolution technique based on interpolation followed by registration using iterative back projection (ibp). low resolution images are interpolated and the interpolated images are registered in order to generate a sharper high resolution image. the proposed technique is tested on lena, elaine, pepper, and baboon. the quantitative peak signal-to-noise ratio (psnr) and structural similarity index (ssim) results, as well as the visual results, show the superiority of the proposed technique over conventional and state-of-art image super resolution techniques. for lena's image, the psnr is 6.52 db higher than with bicubic interpolation.",4 "heuristic optimization for automated distribution system planning in network integration studies. network integration studies try to assess the impact of future developments, such as the increase of renewable energy sources or the introduction of smart grid technologies, on large-scale network areas. the goals can be to support the strategic alignment of the regulatory framework or to adapt the network planning principles of distribution system operators. this study outlines an approach for automated distribution system planning that can calculate network reconfiguration, reinforcement and extension plans in a fully automated fashion.
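as a hedged illustration of the interpolation-plus-iterative-back-projection idea in the super-resolution abstract above, here is a minimal sketch. it assumes a simple block-average decimation model and an all-zero initial estimate, both hypothetical simplifications of the paper's registration-based pipeline; the psnr helper matches the quality metric the abstract reports.

```python
import numpy as np

def iterative_back_projection(lr, scale=2, n_iter=40, step=0.5):
    # hypothetical simplification of ibp super-resolution: block averaging
    # stands in for the camera's blur/decimation model
    h, w = lr.shape
    hr = np.zeros((h * scale, w * scale))  # deliberately poor initial estimate
    for _ in range(n_iter):
        # simulate the lr image from the current hr estimate
        sim = hr.reshape(h, scale, w, scale).mean(axis=(1, 3))
        # back-project the reconstruction error onto the hr grid
        hr += step * np.kron(lr - sim, np.ones((scale, scale)))
    return hr

def psnr(ref, est, peak=255.0):
    # peak signal-to-noise ratio in db
    mse = np.mean((np.asarray(ref, float) - np.asarray(est, float)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```

the loop drives the simulated low-resolution image toward the observed one, which is the core consistency constraint of ibp regardless of the decimation model chosen.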
this allows the estimation of the expected cost in massive probabilistic simulations of large numbers of real networks and constitutes a core component of a framework for large-scale network integration studies. exemplary case study results are presented that were performed in cooperation with different major distribution system operators. the case studies cover the estimation of expected network reinforcement costs, the technical and economical assessment of smart grid technologies and structural network optimisation.",4 "planning and acting under uncertainty: a new model for spoken dialogue systems. uncertainty plays a central role in spoken dialogue systems. some stochastic models like the markov decision process (mdp) are used to model the dialogue manager. but the partially observable system state and user intention hinder the natural representation of the dialogue state. an mdp-based system degrades fast when the uncertainty about a user's intention increases. we propose a novel dialogue model based on the partially observable markov decision process (pomdp). we use hidden system states and user intentions as the state set, parser results and low-level information as the observation set, and domain actions and dialogue repair actions as the action set. the low-level information is extracted from different input modalities, including speech, keyboard, mouse, etc., using bayesian networks. because of the limitations of exact algorithms, we focus on heuristic approximation algorithms and their applicability to pomdp dialogue management. we also propose two methods for grid point selection in grid-based approximation algorithms.",4 "a deep learning approach to drone monitoring. a drone monitoring system that integrates deep-learning-based detection and tracking modules is proposed in this work. the biggest challenge in adopting deep learning methods for drone detection is the limited amount of training drone images. to address this issue, we develop a model-based drone augmentation technique that automatically generates drone images with a bounding box label on the drone's location. to track a small flying drone, we utilize the residual information between consecutive image frames. finally, we present an integrated detection and tracking system that outperforms each individual module containing detection or tracking only.
experiments show that, even when trained on synthetic data, the proposed system performs well on real world drone images with complex backgrounds. the usc drone detection and tracking dataset with user labeled bounding boxes is available to the public.",4 "r-phoc: segmentation-free word spotting using cnn. this paper proposes a region based convolutional neural network for segmentation-free word spotting. our network takes as input an image and a set of word candidate bounding boxes and embeds all bounding boxes into an embedding space, where word spotting can be casted as a simple nearest neighbour search between the query representation and the candidate bounding boxes. we make use of the phoc embedding as it has previously achieved significant success in segmentation-based word spotting. word candidates are generated using a simple procedure based on grouping connected components using spatial constraints. experiments show that r-phoc, which operates on images directly, can improve the current state-of-the-art on the standard gw dataset and performs as well as phocnet, which was designed for segmentation based word spotting, in some cases.",4 "cluster coloring of the self-organizing map: an information visualization perspective. this paper takes an information visualization perspective on visual representations in the general som paradigm. this involves viewing som-based visualizations through the eyes of bertin's and tufte's theories on data graphics. the regular grid shape of the self-organizing map (som), while being a virtue for linking visualizations to it, restricts the representation of cluster structures. from the viewpoint of information visualization, this paper provides a general, yet simple, solution for projection-based coloring of the som that reveals structures. first, the proposed color space is easy to construct and customize for the purpose of use, while aiming to be perceptually correct and informative through two separable dimensions. second, the coloring method is not dependent on a specific method of projection, but is rather modular to fit any objective function suitable for the task at hand. the cluster coloring is illustrated on two datasets: the iris data, and welfare and poverty indicators.",4 "virtual adversarial ladder networks for semi-supervised learning.
semi-supervised learning (ssl) partially circumvents the high cost of labeling data by augmenting a small labeled dataset with a large and relatively cheap unlabeled dataset drawn from the same distribution. this paper offers a novel interpretation of two deep learning-based ssl approaches, ladder networks and virtual adversarial training (vat), as applying distributional smoothing to their respective latent spaces. we propose a class of models that fuse these approaches. we achieve near-supervised accuracy with high consistency on the mnist dataset using just 5 labels per class: our best model, ladder with layer-wise virtual adversarial noise (lvan-lw), achieves 1.42% +/- 0.12 average error rate on the mnist test set, in comparison with 1.62% +/- 0.65 reported for the ladder network. on adversarial examples generated with the l2-normalized fast gradient method, lvan-lw trained with 5 examples per class achieves an average error rate of 2.4% +/- 0.3, compared to 68.6% +/- 6.5 for the ladder network and 9.9% +/- 7.5 for vat.",4 "the whitehead method and genetic algorithms. in this paper we discuss a genetic version (gwa) of whitehead's algorithm, one of the basic algorithms of combinatorial group theory. it turns out that gwa is surprisingly fast and outperforms the standard whitehead's algorithm in free groups of rank >= 5. experimenting with gwa we collected interesting numerical data that clarifies the time-complexity of whitehead's problem in general. these experiments led us to several mathematical conjectures. if confirmed, they will shed light on hidden mechanisms of the whitehead method and the geometry of automorphic orbits in free groups.",12 "improve lexicon-based word embeddings by word sense disambiguation. some works learn a lexicon together with a corpus to improve word embeddings. however, they either model the lexicon separately and update the neural networks with both corpus and lexicon likelihood, or minimize the distance between synonym pairs in the lexicon. such methods do not consider the relatedness and difference between the corpus and the lexicon, and may not be best optimized. in this paper, we propose a novel method that considers both the relatedness and the difference between the corpus and the lexicon. it trains word embeddings by learning from the corpus to predict a word and the corresponding synonym under the same context at the same time. for polysemous words, we use word sense disambiguation to filter out and eliminate synonyms that have different meanings in the context.
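the l2-normalized fast gradient method used to generate adversarial examples in the ladder-network abstract above can be sketched in a few lines; this is a generic illustration, with `grad_loss` standing in for backpropagation through a trained model (an assumption, not the paper's code).

```python
import numpy as np

def fgm_l2(x, grad_loss, eps=1.0):
    # l2-normalised fast gradient method: move x a distance eps along the
    # loss gradient; grad_loss(x) returns dLoss/dx for the attacked model
    g = np.asarray(grad_loss(x), float)
    norm = np.linalg.norm(g)
    if norm == 0:
        return np.asarray(x, float).copy()
    return x + eps * g / norm
```

for example, with the toy loss 0.5 * ||x - t||^2 the gradient is x - t, so the perturbed point moves directly away from the target t by exactly eps.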
to evaluate the proposed method, we compare the performance of word embeddings trained by the proposed model with control groups without the filter or the lexicon, and with prior works, on word similarity tasks and a text classification task. experimental results show that the proposed model provides better embeddings for polysemous words and improves the performance on text classification.",4 "threshold disorder as a source of diverse and complex behavior in random nets. we study the diversity of complex spatio-temporal patterns in the behavior of random synchronous asymmetric neural networks (rsanns). special attention is given to the impact of disordered threshold values on limit-cycle diversity and limit-cycle complexity in rsanns with `normal' thresholds by default. surprisingly, rsanns exhibit only a small repertoire of rather complex limit-cycle patterns when all parameters are fixed. this repertoire of complex patterns is also rather stable with respect to small parameter changes. these two unexpected results may generalize to the study of other complex systems. in order to reach beyond the seemingly-disabling `stable and small' aspect of the limit-cycle repertoire of rsanns, we have found that when an rsann has threshold disorder above a critical level, a rapid increase in the size of the repertoire of patterns occurs. the repertoire size initially follows a power-law function of the magnitude of the threshold disorder. as the disorder increases further, the limit-cycle patterns become simpler until, at a second critical level, most limit cycles become simple fixed points. nonetheless, for moderate changes of the threshold parameters, rsanns are found to display the specific features of behavior desired for rapidly-responding processing systems: accessibility to a large set of complex patterns.",3 "alignedreid: surpassing human-level performance in person re-identification. in this paper, we propose a novel method called alignedreid that extracts a global feature which is jointly learned with local features. the global feature learning benefits greatly from local feature learning, which performs an alignment/matching by calculating the shortest path between two sets of local features, without requiring extra supervision. after the joint learning, we only keep the global feature to compute the similarities between images. our method achieves rank-1 accuracy of 94.4% on market1501 and 97.8% on cuhk03, outperforming state-of-the-art methods by a large margin.
we also evaluate human-level performance and demonstrate that our method is the first to surpass human-level performance on market1501 and cuhk03, two widely used person reid datasets.",4 "transfer learning for ocropus model training on early printed books. a method is presented that significantly reduces the character error rates for ocr text obtained from ocropus models trained on early printed books when only small amounts of diplomatic transcriptions are available. this is achieved by building from already existing models during training instead of starting from scratch. to overcome discrepancies between the character set of the pretrained model and the additional ground truth, the ocropus code is adapted to allow for alphabet expansion or reduction. the character set becomes capable of flexibly adding and deleting characters from the pretrained alphabet when an existing model is loaded. for the experiments we use a self-trained mixed model on early latin prints and the two standard ocropus models on modern english and german fraktur texts. the evaluation on seven early printed books showed that training from the latin mixed model reduces the average amount of errors by 43% and 26%, respectively, compared to training from scratch with 60 and 150 lines of ground truth, respectively. furthermore, it is shown that even building from mixed models trained on data unrelated to the newly added training and test data can lead to significantly improved recognition results.",4 "multiple context-free tree grammars: lexicalization and characterization. multiple (simple) context-free tree grammars are investigated, where ""simple"" means ""linear and nondeleting"". every multiple context-free tree grammar that is finitely ambiguous can be lexicalized; i.e., it can be transformed into an equivalent one (generating the same tree language) in which each rule of the grammar contains a lexical symbol. due to this transformation, the rank of the nonterminals increases at most by 1, and the multiplicity (or fan-out) of the grammar increases at most by the maximal rank of the lexical symbols; in particular, the multiplicity does not increase when all lexical symbols have rank 0. multiple context-free tree grammars have the same tree generating power as multi-component tree adjoining grammars (provided the latter can use a root-marker). moreover, every multi-component tree adjoining grammar that is finitely ambiguous can be lexicalized.
multiple context-free tree grammars have the same string generating power as multiple context-free (string) grammars, and they have polynomial time parsing algorithms. a tree language generated by a multiple context-free tree grammar is the image of a regular tree language under a deterministic finite-copying macro tree transducer. multiple context-free tree grammars can be used as a synchronous translation device.",4 "utilizing semantic visual landmarks for precise vehicle navigation. this paper presents a new approach for integrating semantic information into vision-based vehicle navigation. although vision-based vehicle navigation systems using pre-mapped visual landmarks are capable of achieving submeter level accuracy in large-scale urban environments, a typical error source in this type of system comes from the presence of visual landmarks or features from temporal objects in the environment, such as cars and pedestrians. we propose a gated factor graph framework that uses the semantic information associated with visual features to make decisions on outlier/inlier computation from three perspectives: the feature tracking process, the geo-referenced map building process, and the navigation system using pre-mapped landmarks. the class category that a visual feature belongs to is extracted by a pre-trained deep learning network trained for semantic segmentation. the feasibility and generality of our approach is demonstrated by implementations on top of two vision-based navigation systems. experimental evaluations validate that the injection of semantic information associated with visual landmarks using our approach achieves substantial improvements in accuracy for gps-denied navigation solutions in large-scale urban scenarios.",4 "a comparative study of the complexity of handwritten bharati characters with major indian scripts. we present bharati, a simple, novel script that can represent the characters of a majority of contemporary indian scripts. the shapes/motifs of bharati characters are drawn from some of the simplest characters of existing indian scripts. bharati characters are designed to strictly reflect the underlying phonetic organization, thereby attributing to the script the qualities of simplicity, familiarity, and ease of acquisition and use.
thus, employing the bharati script as a common script for a majority of indian languages can ameliorate several existing communication bottlenecks in india. we perform a complexity analysis of handwritten bharati script and compare its complexity with that of 9 major indian scripts. the measures of complexity are derived from a theory of handwritten characters based on catastrophe theory. bharati script is shown to be simpler than the 9 major indian scripts in these measures of complexity.",4 "introduction to the conll-2002 shared task: language-independent named entity recognition. we describe the conll-2002 shared task: language-independent named entity recognition. we give background information on the data sets and the evaluation method, present a general overview of the systems that have taken part in the task and discuss their performance.",4 "dimensionapp: an android app to estimate object dimensions. in this project, we develop an android app that uses computer vision techniques to estimate the dimensions of an object present in the field of view. the app is compact in size, accurate up to +/- 5 mm and robust towards touch inputs. we use single-view metrology to compute accurate measurements. unlike previous approaches, our technique does not rely on line detection and can be generalized to any object shape easily.",4 "feature importance scores and lossless feature pruning using banzhaf power indices. understanding the influence of features in machine learning is crucial for interpreting models and selecting the best features for classification. in this work we propose the use of principles from coalitional game theory to reason about the importance of features. in particular, we propose the use of the banzhaf power index as a measure of the influence of features on the outcome of a classifier. we show that features with a banzhaf power index of zero can be losslessly pruned without damage to classifier accuracy. computing the power indices does not require access to data samples. however, if samples are available, the indices can be empirically estimated. we compute banzhaf power indices for a neural network classifier on real-life data, and compare the results to gradient-based feature saliency, and to the coefficients of a logistic regression model with $l_1$ regularization.",19 deciding on hmm parameters based on the number of critical points for gesture recognition from motion capture data.
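a monte-carlo estimate of the banzhaf power indices described in the feature-pruning abstract above can be sketched as follows; `value` is a hypothetical stand-in for evaluating classifier accuracy on a coalition of features, and a feature's index is its average marginal contribution over random coalitions of the remaining features.

```python
import random

def banzhaf_indices(features, value, n_samples=200, seed=0):
    # monte-carlo banzhaf power index: for each feature, average its
    # marginal contribution value(coalition + {f}) - value(coalition)
    # over uniformly random coalitions of the other features
    rng = random.Random(seed)
    indices = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for _ in range(n_samples):
            coalition = frozenset(g for g in others if rng.random() < 0.5)
            total += value(coalition | {f}) - value(coalition)
        indices[f] = total / n_samples
    return indices
```

consistent with the abstract's pruning claim, a feature whose marginal contribution is zero in every coalition receives an index of exactly zero and can be dropped.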
this paper presents a method of choosing the number of states of a hmm based on the number of critical points of the motion capture data. the choice of hidden markov model (hmm) parameters is crucial for the recognizer's performance, as it is the first step of the training and cannot be corrected automatically within the hmm. in this article we define a predictor of the number of states based on the number of critical points of the sequence and test its effectiveness on sample data.,4 "using pivot consistency to decompose and solve functional csps. many studies have been carried out in order to increase the search efficiency of constraint satisfaction problems; among them, some make use of structural properties of the constraint network; others take into account semantic properties of the constraints, generally assuming that all the constraints possess the given property. in this paper, we propose a new decomposition method benefiting from both the semantic properties of functional constraints (not only bijective constraints) and the structural properties of the network; furthermore, not all the constraints need to be functional. we show that under some conditions, the existence of solutions is guaranteed. we first characterize a particular subset of the variables, which we name a root set. we then introduce pivot consistency, a new local consistency which is a weak form of path consistency and can be achieved in o(n^2d^2) complexity (instead of o(n^3d^3) for path consistency), and we present its associated properties; in particular, we show that any consistent instantiation of the root set can be linearly extended to a solution, which leads to the presentation of the aforementioned new method for solving and decomposing functional csps.",4 "arrhythmia detection using a mutual information-based integration method. the aim of this paper is to propose the application of mutual information-based ensemble methods to the analysis and classification of heart beats associated with different types of arrhythmia. models of multilayer perceptrons, support vector machines, and radial basis function neural networks were trained and tested using the mit-bih arrhythmia database. this research brings into focus an ensemble method that, to our knowledge, is a novel application in the area of ecg arrhythmia detection. the proposed classifier ensemble method showed improved performance relative to either a majority voting classifier integration or individual classifier performance.
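the critical-point predictor from the hmm abstract above reduces to counting local extrema of the motion signal; a minimal sketch, where tying the state count to the per-channel average of critical points is an illustrative assumption rather than the paper's exact rule:

```python
def count_critical_points(seq):
    # count interior local extrema: points where the discrete derivative
    # changes sign between consecutive samples
    return sum(
        1
        for prev, cur, nxt in zip(seq, seq[1:], seq[2:])
        if (cur - prev) * (nxt - cur) < 0
    )

def suggest_num_states(channels):
    # assumption for illustration: one hmm state per critical point,
    # averaged over the motion-capture channels
    counts = [count_critical_points(c) for c in channels]
    return max(1, round(sum(counts) / len(counts)))
```

a zig-zag signal such as [0, 1, 0, 1, 0] has three interior extrema, while a monotone ramp has none, so oscillatory gestures suggest more states than smooth ones.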
the overall ensemble accuracy was 98.25%.",4 "survey & experiment: towards the learning accuracy. to attain the best learning accuracy, people move on through difficulties and frustrations. though one can optimize the empirical objective using a given set of samples, its generalization ability to the entire sample distribution remains questionable. even if a fair generalization guarantee is offered, one still wants to know what happens if the regularizer is removed, and/or how well the artificial loss (like the hinge loss) relates to the accuracy. for these reasons, this report surveys four different trials towards the learning accuracy, embracing the major advances in supervised learning theory of the past four years. starting from the generic setting of learning, the first two trials introduce the best optimization and generalization bounds for convex learning, and the third trial gets rid of the regularizer. as an innovative attempt, the fourth trial studies the optimization when the objective is exactly the accuracy, in the special case of binary classification. this report also analyzes the last trial through experiments.",4 "stabilized sparse online learning for sparse data. stochastic gradient descent (sgd) is commonly used for optimization in large-scale machine learning problems. langford et al. (2009) introduce a sparse online learning method to induce sparsity via truncated gradient. with high-dimensional sparse data, however, this method suffers from slow convergence and high variance due to heterogeneity in feature sparsity. to mitigate this issue, we introduce a stabilized truncated stochastic gradient descent algorithm. we employ a soft-thresholding scheme on the weight vector where the imposed shrinkage is adaptive to the amount of information available for each feature. the variability in the resulting sparse weight vector is further controlled by stability selection integrated with the informative truncation. to facilitate better convergence, we adopt an annealing strategy on the truncation rate, which leads to a balanced trade-off between exploration and exploitation in learning a sparse weight vector. numerical experiments show that our algorithm compares favorably with the original algorithm in terms of prediction accuracy, achieved sparsity and stability.",19 "combining a probabilistic sampling technique and simple heuristics to solve the dynamic path planning problem.
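the truncated-gradient baseline that the stabilized algorithm above builds on can be sketched as follows; this is a plain soft-thresholding variant for squared loss, deliberately omitting the paper's adaptive shrinkage, stability selection and annealing machinery (all hyper-parameters here are illustrative assumptions).

```python
import numpy as np

def truncated_sgd(X, y, lr=0.05, truncate_every=10, threshold=0.01, epochs=5, seed=0):
    # sparse online learning via truncated gradient: plain sgd on squared
    # loss, with the weight vector soft-thresholded every K updates
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(len(y)):
            err = X[i] @ w - y[i]          # squared-loss residual
            w -= lr * err * X[i]           # sgd step
            t += 1
            if t % truncate_every == 0:    # truncation: soft-threshold w
                w = np.sign(w) * np.maximum(np.abs(w) - threshold, 0.0)
    return w
```

on data where the target depends on a single feature, the periodic soft-thresholding pulls the irrelevant weights toward exactly zero while the gradient keeps restoring the informative one.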
probabilistic sampling methods have become popular for solving single-shot path planning problems. rapidly-exploring random trees (rrts) in particular have been shown to be efficient in solving high dimensional problems. even though several rrt variants have been proposed to tackle the dynamic replanning problem, these methods only perform well in environments with infrequent changes. this paper addresses the dynamic path planning problem by combining simple techniques in a multi-stage probabilistic algorithm. this algorithm uses rrts as an initial solution, informed local search to fix unfeasible paths and a simple greedy optimizer. the algorithm is capable of recognizing when the local search is stuck and subsequently restarts the rrt. we show that this combination of simple techniques provides better responses in a highly dynamic environment than the dynamic rrt variants.",4 "customized nonlinear bandits for online response selection in neural conversation models. dialog response selection is an important step towards natural response generation in conversational agents. existing work on neural conversational models mainly focuses on offline supervised learning using a large set of context-response pairs. in this paper, we focus on online learning of response selection in retrieval-based dialog systems. we propose a contextual multi-armed bandit model with a nonlinear reward function that uses a distributed representation of text for online response selection. a bidirectional lstm is used to produce the distributed representations of the dialog context and responses, which serve as the input to the contextual bandit. for learning the bandit, we propose a customized thompson sampling method that is applied to a polynomial feature space in approximating the reward. experimental results on the ubuntu dialogue corpus demonstrate significant performance gains of the proposed method over conventional linear contextual bandits. moreover, we report encouraging response selection performance of the proposed neural bandit model using the recall@k metric on a small set of online training samples.",4 "distributional measures of semantic distance: a survey. the ability to mimic human notions of semantic distance has widespread applications. some measures rely only on raw text (distributional measures) while others rely on knowledge sources such as wordnet.
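the thompson-sampling response selection described in the bandit abstract above can be illustrated with a minimal linear bandit; the linear-gaussian reward model is a deliberate simplification standing in for the paper's lstm-based polynomial features, and the candidate vectors are hypothetical joint context-response embeddings.

```python
import numpy as np

class LinearTS:
    # minimal linear thompson-sampling contextual bandit: keep a gaussian
    # posterior over the reward weights, sample from it, pick the argmax
    def __init__(self, dim, seed=0):
        self.A = np.eye(dim)        # posterior precision (ridge prior)
        self.b = np.zeros(dim)
        self.rng = np.random.default_rng(seed)

    def select(self, candidates):
        # candidates: (n_responses, dim) feature matrix
        A_inv = np.linalg.inv(self.A)
        theta = self.rng.multivariate_normal(A_inv @ self.b, A_inv)
        return int(np.argmax(candidates @ theta))

    def update(self, x, reward):
        # bayesian linear-regression update for the chosen response
        self.A += np.outer(x, x)
        self.b += reward * x
```

early rounds explore because the posterior is wide; as evidence accumulates, the sampled weights concentrate and the bandit converges on the higher-reward response.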
although extensive studies have been performed to compare wordnet-based measures with human judgment, the use of distributional measures as proxies to estimate semantic distance has received little attention. even though they have traditionally performed poorly when compared to wordnet-based measures, they lay claim to certain uniquely attractive features, such as applicability to resource-poor languages and the ability to mimic both semantic similarity and semantic relatedness. therefore, this paper presents a detailed study of distributional measures. particular attention is paid to fleshing out the strengths and limitations of both wordnet-based and distributional measures, and to how distributional measures of distance can be brought more in line with human notions of semantic distance. we conclude with a brief discussion of recent work on hybrid measures.",4 "mapping mutable genres in structurally complex volumes. to mine large digital libraries in humanistically meaningful ways, scholars need to divide them by genre. this is a task that classification algorithms are well suited to assist, but they need adjustment to address the specific challenges of this domain. digital libraries pose two problems of scale not usually found in the article datasets used to test these algorithms. 1) libraries span several centuries, so the genres to be identified may change gradually across the time axis. 2) volumes are much longer than articles, and tend to be internally heterogeneous, so the classification task needs to begin with segmentation. we describe a multi-layered solution that trains hidden markov models to segment volumes, and uses ensembles of overlapping classifiers to address historical change. we test this approach on a collection of 469,200 volumes drawn from the hathitrust digital library. to demonstrate the humanistic value of these methods, we extract 32,209 volumes of fiction from the digital library, and trace the changing proportions of first- and third-person narration in the corpus. we note that narrative points of view seem to have strong associations with particular themes and genres.",4 "a random matrix theoretical approach to early event detection in smart grid. power systems are developing fast nowadays, both in size and in complexity; this situation is a challenge for early event detection (eed). this paper proposes a data-driven unsupervised learning method to handle this challenge.
specifically, random matrix theories (rmts) are introduced as the statistical foundations for random matrix models (rmms); based on the rmms, linear eigenvalue statistics (less) are defined via test functions as the system indicators. by comparing the experimental values of the les with the theoretical ones, anomaly detection is conducted. furthermore, we develop a 3d power-map to visualize the les; it provides a robust auxiliary decision-making mechanism for the operators. in this sense, the proposed method conducts eed with a pure statistical procedure, requiring no knowledge of system topologies, unit operation/control models, etc. the les, as a key ingredient of the procedure, is a high dimensional indicator derived directly from raw data. as an unsupervised learning indicator, the les is much more sensitive than the low dimensional indicators obtained via supervised learning. with this statistical procedure, the proposed method is universal and fast; moreover, it is robust against traditional eed challenges (such as error accumulation, spurious correlations, and even bad data in the core area). case studies, with both simulated data and real ones, validate the proposed method. to manage large-scale distributed systems, data fusion is mentioned as another data processing ingredient.",19 "redefining context windows for word embedding models: an experimental study. distributional semantic models learn vector representations of words through the contexts they occur in. although the choice of context (which often takes the form of a sliding window) has a direct influence on the resulting embeddings, the exact role of this model component is still not fully understood. this paper presents a systematic analysis of context windows based on a set of four distinct hyper-parameters. we train continuous skip-gram models on two english-language corpora for various combinations of these hyper-parameters, and evaluate them on both lexical similarity and analogy tasks. notable experimental results are the positive impact of cross-sentential contexts and the surprisingly good performance of right-context windows.",4 "connectivity preserving multivalued functions in digital topology. we study connectivity preserving multivalued functions between digital images. this notion generalizes that of continuous multivalued functions studied mostly in the setting of the digital plane $z^2$.
we show that connectivity preserving multivalued functions, like continuous multivalued functions, are appropriate models for digital morphological operations. connectivity preservation, unlike continuity, is preserved by compositions, and generalizes easily to higher dimensions and arbitrary adjacency relations.",4 "using linear constraints for logic program termination analysis. it is widely acknowledged that function symbols are an important feature in answer set programming, as they make modeling easier, increase the expressive power, and allow us to deal with infinite domains. the main issue with their introduction is that the evaluation of a program might not terminate, and checking whether it terminates or not is undecidable. to cope with this problem, several classes of logic programs have been proposed where the use of function symbols is restricted but the termination of program evaluation is guaranteed. despite the significant body of work in this area, current approaches do not include many simple practical programs whose evaluation terminates. in this paper, we present the novel classes of rule-bounded and cycle-bounded programs, which overcome different limitations of current approaches by performing a more global analysis of how terms are propagated from the bodies to the heads of rules. results on the correctness, the complexity, and the expressivity of the proposed approach are provided.",4 "assessing the value of 3d reconstruction in building construction. 3-dimensional (3d) reconstruction is an emerging field in image processing and computer vision that aims to create 3d visualizations/models of objects/scenes from image sets. however, its commercial applications and benefits are yet to be fully explored. in this paper, we describe ongoing work towards assessing the value of 3d reconstruction in the building construction domain. we present preliminary results from a user study whose objective is to understand the use of visual information in building construction, in order to determine problems with the use of visual information and identify potential benefits and scenarios of use for 3d reconstruction.",4 "segmentation of camera captured business card images for mobile devices. due to the huge deformation in camera captured images, the variety in the nature of business cards and the computational constraints of mobile devices, the design of an efficient business card reader (bcr) is challenging to researchers.
extraction of text regions and segmenting them into characters is one such challenge. in this paper, we present an efficient character segmentation technique for business card images captured by a cell-phone camera, designed in the present work towards developing an efficient bcr. at first, the text regions are extracted from the card images and the skewed ones are corrected using a computationally efficient skew correction technique. at last, the skew corrected text regions are segmented into lines and characters based on horizontal and vertical histograms. experiments show that the present technique is efficient and applicable for mobile devices: a mean segmentation accuracy of 97.48% is achieved on 3 mega-pixel (500-600 dpi) images. it takes 1.1 seconds for segmentation, including all preprocessing steps, on a moderately powerful notebook (dualcore t2370, 1.73 ghz, 1gb ram, 1mb l2 cache).",4 "ask your neurons: a neural-based approach to answering questions about images. we address a question answering task on real-world images that is set up as a visual turing test. by combining the latest advances in image representation and natural language processing, we propose neural-image-qa, an end-to-end formulation of this problem for which all parts are trained jointly. in contrast to previous efforts, we are facing a multi-modal problem where the language output (answer) is conditioned on visual and natural language input (image and question). our approach neural-image-qa doubles the performance of the previous best approach on this problem. we provide additional insights into the problem by analyzing how much information is contained in the language part alone, and we provide a new human baseline. to study human consensus, which is related to the ambiguities inherent in this challenging task, we propose two novel metrics and collect additional answers which extend the original daquar dataset to daquar-consensus.",4 """you are no jack kennedy"": on media selection of highlights from presidential debates. political speeches and debates play an important role in shaping the images of politicians, and the public often relies on media outlets to select bits of political communication from a large pool of utterances. it is an important research question to understand what factors impact this selection process. to quantitatively explore the selection process, we build a three-decade dataset of presidential debate transcripts and post-debate coverage.
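the horizontal/vertical histogram scheme from the business card segmentation abstract above can be sketched in a few lines; this simplified version assumes an already deskewed binary image with background pixels equal to 0 (the skew correction step of the paper is omitted).

```python
import numpy as np

def segment_runs(profile):
    # split a 1-d ink histogram into (start, end) runs separated by zeros
    runs, start = [], None
    for i, v in enumerate(profile):
        if v > 0 and start is None:
            start = i
        elif v == 0 and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs

def segment_lines_and_chars(binary):
    # horizontal profile -> text lines; per-line vertical profile -> characters
    lines = segment_runs(binary.sum(axis=1))
    return [(r, segment_runs(binary[r[0]:r[1]].sum(axis=0))) for r in lines]
```

each returned pair is a line's row range together with the column ranges of its characters, i.e. exactly the line-then-character decomposition the abstract describes.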
we first examine the effect of wording and propose a binary classification framework that controls for both the speaker and the debate situation. we find that crowdworkers only achieve an accuracy of 60% in this task, indicating that media choices are not entirely obvious. our classifiers outperform crowdworkers on average, mainly in primary debates. we also compare the important factors from crowdworkers' free-form explanations with those from data-driven methods and find interesting differences. crowdworkers mentioned that ""context matters"", whereas the data show that well-quoted sentences are more distinct from the previous utterance by the same speaker than less-quoted sentences. finally, we examine the aggregate effect of media preferences towards different wordings to understand the extent of fragmentation among media outlets. by analyzing a bipartite graph built from the quoting behavior in our data, we observe a decreasing trend in bipartisan coverage.",4 "phrase-based image captioning with hierarchical lstm model. automatic generation of a caption to describe the content of an image has been gaining a lot of research interest recently, but most of the existing works treat the image caption as pure sequential data. natural language, however, possesses a temporal hierarchy structure, with complex dependencies between each subsequence. in this paper, we propose a phrase-based hierarchical long short-term memory (phi-lstm) model to generate image descriptions. in contrast to conventional solutions that generate the caption in a pure sequential manner, the proposed model decodes the image caption from phrase to sentence. it consists of a phrase decoder at the bottom hierarchy to decode noun phrases of variable length, and an abbreviated sentence decoder at the upper hierarchy to decode an abbreviated form of the image description. the complete image caption is formed by combining the generated phrases with the sentence during the inference stage. empirically, the proposed model shows a better or competitive result on the flickr8k, flickr30k and ms-coco datasets in comparison to the state-of-the-art models. we also show that the proposed model is able to generate novel captions (not seen in the training data) which have richer word contents, on all three datasets.",4 "fusion of stereo and still monocular depth estimates in a self-supervised learning context. we study how autonomous robots can learn to improve their depth estimation capability.
in particular, we investigate a self-supervised learning setup in which stereo vision depth estimates serve as targets for a convolutional neural network (cnn) that transforms a single still image into a dense depth map. during training, the stereo and mono estimates are fused with a novel fusion method that preserves high confidence stereo estimates, while leveraging the cnn estimates in the low-confidence regions. the main contribution of the article is that it is shown that the fused estimates lead to higher performance than the stereo vision estimates alone. experiments are performed on the kitti dataset, and on board of a parrot slamdunk, showing that even rather limited cnns can help provide stereo vision equipped robots with more reliable depth maps for autonomous navigation.",4 "expandnet: a deep convolutional neural network for high dynamic range expansion from low dynamic range content. high dynamic range (hdr) imaging provides the capability of handling real world lighting, as opposed to traditional low dynamic range (ldr) imaging, which struggles to accurately represent images with higher dynamic range. however, most imaging content is still available only in ldr. this paper presents a method for generating hdr content from ldr content based on deep convolutional neural networks (cnns), termed expandnet. expandnet accepts ldr images as input and generates images with an expanded range in an end-to-end fashion. the model attempts to reconstruct missing information that was lost from the original signal due to quantization, clipping, tone mapping or gamma correction. the added information is reconstructed from learned features, as the network is trained in a supervised fashion using a dataset of hdr images. the approach is fully automatic and data driven; it does not require any heuristics or human expertise. expandnet uses a multiscale architecture which avoids the use of upsampling layers to improve image quality. the method performs well compared to expansion/inverse tone mapping operators quantitatively on multiple metrics, even for badly exposed inputs.",4 "specification inference from demonstrations. learning from expert demonstrations has received a lot of attention in artificial intelligence and machine learning.
goal infer underlying reward function agent optimizing given set observations agent's behavior time variety circumstances, system state trajectories, plant model specifying evolution system state different agent's actions. system often modeled markov decision process, is, next state depends current state agent's action, agent's choice action depends current state. former markovian assumption evolution system state, latter assumes target reward function markovian. work, explore learning class non-markovian reward functions, known formal methods literature specifications. specifications offer better composition, transferability, interpretability. show inferring specification done efficiently without unrolling transition system. demonstrate 2-d grid world example.",4 "exact reasoning uncertainty. paper focuses designing expert systems support decision making complex, uncertain environments. context, research indicates strictly probabilistic representations, enable use decision-theoretic reasoning, highly preferable recently proposed alternatives (e.g., fuzzy set theory dempster-shafer theory). furthermore, discuss language influence diagrams corresponding methodology -- decision analysis -- allows decision theory used effectively efficiently decision-making aid. finally, use rachel, system helps infertile couples select medical treatments, illustrate methodology decision analysis basis expert decision systems.
however, goal ask user little possible. define combined solving preference elicitation scheme large number different instantiations, corresponding concrete algorithm compare experimentally. compute number elicited preferences ""user effort"", may larger, contains preference values user compute able respond elicitation requests. number elicited preferences important concern communicate little information possible, user effort measures also hidden work user able communicate elicited preferences. experimental results show algorithms good finding necessarily optimal solution asking user small fraction missing preferences. user effort also small best algorithms. finally, test algorithms hard constraint problems possibly missing constraints, aim find feasible solutions irrespective missing constraints.",4 "bio-inspired unsupervised learning visual features leads robust invariant object recognition. retinal image surrounding objects varies tremendously due changes position, size, pose, illumination condition, background context, occlusion, noise, nonrigid deformations. despite huge variations, visual system able invariantly recognize object fraction second. date, various computational models proposed mimic hierarchical processing ventral visual pathway, limited success. here, show association biologically inspired network architecture learning rule significantly improves models' performance facing challenging invariant object recognition problems. model asynchronous feedforward spiking neural network. network presented natural images, neurons entry layers detect edges, activated ones fire first, neurons higher layers equipped spike timing-dependent plasticity. neurons progressively become selective intermediate complexity visual features appropriate object categorization. model evaluated 3d-object eth-80 datasets two benchmarks invariant object recognition, shown outperform state-of-the-art models, including deepconvnet hmax. 
demonstrates ability accurately recognize different instances multiple object classes even various appearance conditions (different views, scales, tilts, backgrounds). several statistical analysis techniques used show model extracts class specific highly informative features.",4 "lexrank: graph-based lexical centrality salience text summarization. introduce stochastic graph-based method computing relative importance textual units natural language processing. test technique problem text summarization (ts). extractive ts relies concept sentence salience identify important sentences document set documents. salience typically defined terms presence particular important words terms similarity centroid pseudo-sentence. consider new approach, lexrank, computing sentence importance based concept eigenvector centrality graph representation sentences. model, connectivity matrix based intra-sentence cosine similarity used adjacency matrix graph representation sentences. system, based lexrank ranked first place one task recent duc 2004 evaluation. paper present detailed analysis approach apply larger data set including data earlier duc evaluations. discuss several methods compute centrality using similarity graph. results show degree-based methods (including lexrank) outperform centroid-based methods systems participating duc cases. furthermore, lexrank threshold method outperforms degree-based techniques including continuous lexrank. also show approach quite insensitive noise data may result imperfect topical clustering documents.",4 "action schema networks: generalised policies deep learning. paper, introduce action schema network (asnet): neural network architecture learning generalised policies probabilistic planning problems. mimicking relational structure planning problems, asnets able adopt weight-sharing scheme allows network applied problem given planning domain. allows cost training network amortised problems domain. 
further, propose training method balances exploration supervised training small problems produce policy remains robust evaluated larger problems. experiments, show asnet's learning capability allows significantly outperform traditional non-learning planners several challenging domains.",4 "bilinear cnns fine-grained visual recognition. present simple effective architecture fine-grained visual recognition called bilinear convolutional neural networks (b-cnns). networks represent image pooled outer product features derived two cnns capture localized feature interactions translationally invariant manner. b-cnns belong class orderless texture representations unlike prior work trained end-to-end manner. accurate model obtains 84.1%, 79.4%, 86.9% 91.3% per-image accuracy caltech-ucsd birds [67], nabirds [64], fgvc aircraft [42], stanford cars [33] dataset respectively runs 30 frames-per-second nvidia titan x gpu. present systematic analysis networks show (1) bilinear features highly redundant reduced order magnitude size without significant loss accuracy, (2) also effective image classification tasks texture scene recognition, (3) trained scratch imagenet dataset offering consistent improvements baseline architecture. finally, present visualizations models various datasets using top activations neural units gradient-based inversion techniques. source code complete system available http://vis-www.cs.umass.edu/bcnn.",4 "segmentation facial expressions using semi-definite programming generalized principal component analysis. paper, use semi-definite programming generalized principal component analysis (gpca) distinguish two different facial expressions. first step, semi-definite programming used reduce dimension image data ""unfold"" manifold data points (corresponding facial expressions) reside on. next, gpca used fit series subspaces data points associate data point subspace. data points belong subspace claimed belong facial expression category. 
example provided.",4 "pde-net: learning pdes data. paper, present initial attempt learn evolution pdes data. inspired latest development neural network designs deep learning, propose new feed-forward deep network, called pde-net, fulfill two objectives time: accurately predict dynamics complex systems uncover underlying hidden pde models. basic idea proposed pde-net learn differential operators learning convolution kernels (filters), apply neural networks machine learning methods approximate unknown nonlinear responses. comparing existing approaches, either assume form nonlinear response known fix certain finite difference approximations differential operators, approach flexibility learning differential operators nonlinear responses. special feature proposed pde-net filters properly constrained, enables us easily identify governing pde models still maintaining expressive predictive power network. constraints carefully designed fully exploiting relation orders differential operators orders sum rules filters (an important concept originated wavelet theory). also discuss relations pde-net existing networks computer vision network-in-network (nin) residual neural network (resnet). numerical experiments show pde-net potential uncover hidden pde observed dynamics, predict dynamical behavior relatively long time, even noisy environment.",12 "knowledge structures evidential reasoning decision analysis. roles played decision factors making complex subject decisions characterized factors affect overall decision. evidence partially matches factor evaluated, effective computational rules applied roles form appropriate aggregation evidence. use technique supports expression deeper levels causality, may also preserve cognitive structure decision maker better usual weighting methods, certainty-factor probabilistic models can.",4 "learning natural language inference lstm. natural language inference (nli) fundamentally important task natural language processing many applications.
recently released stanford natural language inference (snli) corpus made possible develop evaluate learning-centered methods deep neural networks natural language inference (nli). paper, propose special long short-term memory (lstm) architecture nli. model builds top recently proposed neural attention model nli based significantly different idea. instead deriving sentence embeddings premise hypothesis used classification, solution uses match-lstm perform word-by-word matching hypothesis premise. lstm able place emphasis important word-level matching results. particular, observe lstm remembers important mismatches critical predicting contradiction neutral relationship label. snli corpus, model achieves accuracy 86.1%, outperforming state art.",4 "identification parameters underlying emotions classification emotions. standard classification emotions involves categorizing expression emotions. paper, parameters underlying emotions identified new classification based parameters suggested.",4 "morphology generation statistical machine translation. translating morphologically rich languages, statistical mt approaches face problem data sparsity. severity sparseness problem high corpus size morphologically richer language less. even though use factored models correctly generate morphological forms words, problem data sparseness limits performance. paper, describe simple effective solution based enriching input corpora various morphological forms words. use method phrase-based factor-based experiments two morphologically rich languages: hindi marathi translating english. evaluate performance experiments terms automatic evaluation subjective evaluation adequacy fluency. observe morphology injection method helps improving quality translation. analyze morph injection method helps handling data sparseness problem great level.",4 "prediction optimal threshold value df relay selection schemes based artificial neural networks. 
wireless communications, cooperative communication (cc) technology promises performance gains compared traditional single-input single output (siso) techniques. therefore, cc technique one nominees 5g networks. decode-and-forward (df) relaying scheme one cc techniques, determination threshold value relay key role system performance power usage. paper, propose prediction optimal threshold values best relay selection scheme cooperative communications, based artificial neural networks (anns) first time literature. average link qualities number relays used inputs prediction optimal threshold values using artificial neural networks (anns): multi-layer perceptron (mlp) radial basis function (rbf) networks. mlp network better performance rbf network prediction optimal threshold value number neurons used hidden layer networks. besides, optimal threshold values obtained using anns verified optimal threshold values obtained numerically using closed form expression derived system. results show optimal threshold values obtained anns best relay selection scheme provide minimum bit-error-rate (ber) reduction probability error propagation may occur. also, ber performance goal, prediction optimal threshold values provides 2db less power usage, great gain terms green communication.",6 "negative tree reweighted belief propagation. introduce new class lower bounds log partition function markov random field makes use reversed jensen's inequality. particular, method approximates intractable distribution using linear combination spanning trees negative weights. technique lower-bound counterpart tree-reweighted belief propagation algorithm, uses convex combination spanning trees positive weights provide corresponding upper bounds. develop algorithms optimize tighten lower bounds non-convex set valid parameter values.
algorithm generalizes mean field approaches (including naive structured mean field approximations), includes limiting case.",4 "universal prediction bayesian confirmation. bayesian framework well-studied successful framework inductive reasoning, includes hypothesis testing confirmation, parameter estimation, sequence prediction, classification, regression. standard statistical guidelines choosing model class prior always available fail, particular complex situations. solomonoff completed bayesian framework providing rigorous, unique, formal, universal choice model class prior. discuss breadth sense universal (non-i.i.d.) sequence prediction solves various (philosophical) problems traditional bayesian sequence prediction. show solomonoff's model possesses many desirable properties: strong total weak instantaneous bounds, contrast classical continuous prior densities zero p(oste)rior problem, i.e. confirm universal hypotheses, reparametrization regrouping invariant, avoids old-evidence updating problem. even performs well (actually better) non-computable environments.",12 "tree-cut probabilistic image segmentation. paper presents new probabilistic generative model image segmentation, i.e. task partitioning image homogeneous regions. model grounded mid-level image representation, called region tree, regions recursively split subregions superpixels reached. given region tree, image segmentation formalized sampling cuts tree model. inference cuts exact, formulated using dynamic programming. tree-cut model tuned sample segmentations particular scale interest many possible multiscale image segmentations. generalizes common notion one correct segmentation per image. also, allows moving beyond standard single-scale evaluation, segmentation result image averaged corresponding set coarse fine human annotations, conduct scale-specific evaluation. 
quantitative results comparable leading gpb-owt-ucm method, notable advantage additionally produce distribution possible tree-consistent segmentations image.",19 "cgans projection discriminator. propose novel, projection based way incorporate conditional information discriminator gans respects role conditional information underlying probabilistic model. approach contrast frameworks conditional gans used application today, use conditional information concatenating (embedded) conditional vector feature vectors. modification, able significantly improve quality class conditional image generation ilsvrc2012 (imagenet) 1000-class image dataset current state-of-the-art result, achieved single pair discriminator generator. also able extend application super-resolution succeeded producing highly discriminative super-resolution images. new structure also enabled high quality category transformation based parametric functional transformation conditional batch normalization layers generator.",4 "survey video forgery detection. digital forgeries though visibly identifiable human perception may alter meddle underlying natural statistics digital content. tampering involves fiddling video content order cause damage make unauthorized alteration/modification. tampering detection video cumbersome compared image considering properties video. tampering impacts need studied applied technique/method used establish factual information legal course judiciary. paper give overview prior literature challenges involved video forgery detection passive approach found.",4 "heavy hitters via cluster-preserving clustering. turnstile $\ell_p$ $\varepsilon$-heavy hitters, one maintains high-dimensional $x\in\mathbb{r}^n$ subject $\texttt{update}(i,\delta)$ causing $x_i\leftarrow x_i + \delta$, $i\in[n]$, $\delta\in\mathbb{r}$.
upon receiving query, goal report small list $l\subset[n]$, $|l| = o(1/\varepsilon^p)$, containing every ""heavy hitter"" $i\in[n]$ $|x_i| \ge \varepsilon \|x_{\overline{1/\varepsilon^p}}\|_p$, $x_{\overline{k}}$ denotes vector obtained zeroing largest $k$ entries $x$ magnitude. $p\in(0,2]$ countsketch solves $\ell_p$ heavy hitters using $o(\varepsilon^{-p}\log n)$ words space $o(\log n)$ update time, $o(n\log n)$ query time output $l$, whose output query correct high probability (whp) $1 - 1/poly(n)$. unfortunately query time slow. remedy this, work [cm05] proposed $p=1$ strict turnstile model, whp correct algorithm achieving suboptimal space $o(\varepsilon^{-1}\log^2 n)$, worse update time $o(\log^2 n)$, much better query time $o(\varepsilon^{-1}poly(\log n))$. show tradeoff space update time versus query time unnecessary. provide new algorithm, expandersketch, general turnstile model achieves optimal $o(\varepsilon^{-p}\log n)$ space, $o(\log n)$ update time, fast $o(\varepsilon^{-p}poly(\log n))$ query time, whp correctness. main innovation efficient reduction heavy hitters clustering problem heavy hitter encoded form noisy spectral cluster much bigger graph, goal identify every cluster. since every heavy hitter must found, correctness requires every cluster found. develop ""cluster-preserving clustering"" algorithm, partitioning graph clusters without destroying original cluster.",4 "fast factorization-based approach robust pca. robust principal component analysis (rpca) widely used recovering low-rank matrices many data mining machine learning problems. separates data matrix low-rank part sparse part. convex approach well studied literature. however, state-of-the-art algorithms convex approach usually relatively high complexity due need solving (partial) singular value decompositions large matrices. non-convex approach, altproj, also proposed lighter complexity better scalability. 
given true rank $r$ underlying low rank matrix, altproj complexity $o(r^2dn)$, $d\times n$ size data matrix. paper, propose novel factorization-based model rpca, complexity $o(kdn)$, $k$ upper bound true rank. method need precise value true rank. extensive experiments, observe altproj work $r$ precisely known advance; however, needed rank parameter $r$ specified value different true rank, altproj cannot fully separate two parts method succeeds. even work, method 4 times faster altproj. method used light-weight, scalable tool rpca absence precise value true rank.",4 "fractal dimension analysis automatic morphological galaxy classification. report present experimental results using \emph{haussdorf-besicovich} fractal dimension performing morphological galaxy classification. fractal dimension topological, structural spatial property give us information space object lives. calculated fractal dimension value main types galaxies: ellipticals, spirals irregulars; use feature classifying them. also, performed image analysis process order standardize galaxy images, used principal component analysis obtain main attributes images. galaxy classification performed using machine learning algorithms: c4.5, k-nearest neighbors, random forest support vector machines. preliminary experimental results using 10-fold cross-validation show fractal dimension helps improve classification, 88 per cent accuracy elliptical galaxies, 100 per cent accuracy spiral galaxies 40 per cent irregular galaxies.",4 "randomized robust subspace recovery high dimensional data matrices. paper explores analyzes two randomized designs robust principal component analysis (pca) employing low-dimensional data sketching. one design, data sketch constructed using random column sampling followed low dimensional embedding, other, sketching based random column row sampling. designs shown bring substantial savings complexity memory requirements robust subspace learning conventional approaches use full scale data. 
characterization sample computational complexity designs derived context two distinct outlier models, namely, sparse independent outlier models. proposed randomized approach provably recover correct subspace computational sample complexity almost independent size data. results mathematical analysis confirmed numerical simulations using synthetic real data.",19 "learning learn weak supervision full supervision. paper, propose method training neural networks large set data weak labels small amount data true labels. proposed model, train two neural networks: target network, learner confidence network, meta-learner. target network optimized perform given task trained using large set unlabeled data weakly annotated. propose control magnitude gradient updates target network using scores provided second confidence network, trained small amount supervised data. thus avoid weight updates computed noisy labels harm quality target network model.",19 "parametric gaussian process regression big data. work introduces concept parametric gaussian processes (pgps), built upon seemingly self-contradictory idea making gaussian processes parametric. parametric gaussian processes, construction, designed operate ""big data"" regimes one interested quantifying uncertainty associated noisy data. proposed methodology circumvents well-established need stochastic variational inference, scalable algorithm approximating posterior distributions. effectiveness proposed approach demonstrated using illustrative example simulated data benchmark dataset airline industry approximately 6 million records.",19 "philosophical essay life connections genetic algorithms. paper makes number connections life various facets genetic evolutionary algorithms research. specifically, addresses topics adaptation, multiobjective optimization, decision making, deception, search operators, among others. argues human life, birth death, adaptive dynamic optimization problem people continuously searching happiness. 
important, paper speculates genetic algorithms used source inspiration helping people make decisions everyday life.",4 "3d scan registration using curvelet features planetary environments. topographic mapping planetary environments relies accurate 3d scan registration methods. however, global registration algorithms relying features fpfh harris-3d show poor alignment accuracy settings due poor structure mars-like terrain variable resolution, occluded, sparse range data hard register without a-priori knowledge environment. paper, propose alternative approach 3d scan registration using curvelet transform performs multi-resolution geometric analysis obtain set coefficients indexed scale (coarsest finest), angle spatial position. features detected curvelet domain take advantage directional selectivity transform. descriptor computed feature calculating 3d spatial histogram image gradients, nearest neighbor based matching used calculate feature correspondences. correspondence rejection using random sample consensus identifies inliers, locally optimal singular value decomposition-based estimation rigid-body transformation aligns laser scans given re-projected correspondences metric space. experimental results publicly available data-set planetary analogue indoor facility, well simulated real-world scans neptec design group's ivigms 3d laser rangefinder outdoor csa mars yard demonstrates improved performance existing methods challenging sparse mars-like terrain.",4 "boost picking: universal method converting supervised classification semi-supervised classification. paper proposes universal method, boost picking, train supervised classification models mainly un-labeled data. boost picking adopts two weak classifiers estimate correct error. theoretically proved boost picking could train supervised model mainly un-labeled data effectively model trained 100% labeled data, recalls two weak classifiers greater zero sum precisions greater one. 
based boost picking, present ""test along training (tawt)"" improve generalization supervised models. boost picking tawt successfully tested varied little data sets.",4 "demystifying alphago zero alphago gan. astonishing success alphago zero\cite{silver_alphago} invokes worldwide discussion future human society mixed mood hope, anxiousness, excitement fear. try dymystify alphago zero qualitative analysis indicate alphago zero understood specially structured gan system expected possess inherent good convergence property. thus deduct success alphago zero may sign new generation ai.",4 "kernel classification framework metric learning. learning distance metric given training samples plays crucial role many machine learning tasks, various models optimization algorithms proposed past decade. paper, generalize several state-of-the-art metric learning methods, large margin nearest neighbor (lmnn) information theoretic metric learning (itml), kernel classification framework. first, doublets triplets constructed training samples, family degree-2 polynomial kernel functions proposed pairs doublets triplets. then, kernel classification framework established, generalize many popular metric learning methods lmnn itml, also suggest new metric learning methods, efficiently implemented, interestingly, using standard support vector machine (svm) solvers. two novel metric learning methods, namely doublet-svm triplet-svm, developed proposed framework. experimental results show doublet-svm triplet-svm achieve competitive classification accuracies state-of-the-art metric learning methods itml lmnn significantly less training time.",4 "understanding deep convolutional networks. deep convolutional networks provide state art classifications regressions results many high-dimensional problems. review architecture, scatters data cascade linear filter weights non-linearities. mathematical framework introduced analyze properties. 
computations invariants involve multiscale contractions, linearization hierarchical symmetries, sparse separations. applications discussed.",19 "toward deeper understanding nonconvex stochastic optimization momentum using diffusion approximations. momentum stochastic gradient descent (msgd) algorithm widely applied many nonconvex optimization problems machine learning. popular examples include training deep neural networks, dimensionality reduction, etc. due lack convexity extra momentum term, optimization theory msgd still largely unknown. paper, study fundamental optimization algorithm based so-called ""strict saddle problem."" diffusion approximation type analysis, study shows momentum helps escape saddle points, hurts convergence within neighborhood optima (if without step size annealing). theoretical discovery partially corroborates empirical success msgd training deep neural networks. moreover, analysis applies martingale method ""fixed-state-chain"" method stochastic approximation literature, independent interest.",4 "automos: learning non-intrusive assessor naturalness-of-speech. developers text-to-speech synthesizers (tts) often make use human raters assess quality synthesized speech. demonstrate model human raters' mean opinion scores (mos) synthesized speech using deep recurrent neural network whose inputs consist solely raw waveform. best models provide utterance-level estimates mos moderately inferior sampled human ratings, shown pearson spearman correlations. multiple utterances scored averaged, scenario common synthesizer quality assessment, automos achieves correlations approaching human raters. automos model number applications, ability explore parameter space speech synthesizer without requiring human-in-the-loop.",4 "sparse signal recovery temporally correlated source vectors using sparse bayesian learning. address sparse signal recovery problem context multiple measurement vectors (mmv) elements nonzero row solution matrix temporally correlated. 
existing algorithms consider temporal correlations thus performance degrades significantly correlations. work, propose block sparse bayesian learning framework models temporal correlations. framework derive two sparse bayesian learning (sbl) algorithms, superior recovery performance compared existing algorithms, especially presence high temporal correlations. furthermore, algorithms better handling highly underdetermined problems require less row-sparsity solution matrix. also provide analysis global local minima cost function, show sbl cost function desirable property global minimum sparsest solution mmv problem. extensive experiments also provide interesting results motivate future theoretical research mmv model.",19 "towards understanding feedback supermassive black holes using convolutional neural networks. supermassive black holes centers clusters galaxies strongly interact host environment via agn feedback. key tracers activity x-ray cavities -- regions lower x-ray brightness within cluster. present automatic method detecting, characterizing x-ray cavities noisy, low-resolution x-ray images. simulate clusters galaxies, insert cavities them, produce realistic low-quality images comparable observations high redshifts. train custom-built convolutional neural network generate pixel-wise analysis presence cavities cluster. resnet architecture used decode radii cavities pixel-wise predictions. surpass accuracy, stability, speed current visual inspection based methods simulated data.",1 "recognition networks approximate inference bn20 networks. propose using recognition networks approximate inference bayesian networks (bns). recognition network multilayer perceptron (mlp) trained predict posterior marginals given observed evidence particular bn. input mlp vector states evidential nodes. activity output unit interpreted prediction posterior marginal corresponding variable.
mlp trained using samples generated corresponding bn. evaluate recognition network trained inference large bayesian network, similar structure complexity quick medical reference, decision theoretic (qmr-dt). network binary, two-layer, noisy-or network containing 4000 potentially observable nodes 600 unobservable, hidden nodes. real medical diagnosis, observables unavailable, complex unknown bias selects ones provided. incorporate basic type selection bias network: known preference available observables positive rather negative. even simple bias significant effect posterior. compare performance recognition network state-of-the-art approximate inference algorithms large set test cases. order evaluate effect simplistic model selection bias, evaluate algorithms using variety incorrectly modeled observation biases. recognition networks perform well using correct incorrect observation biases.",4 "new sufficient condition 1-coverage imply connectivity. effective approach energy conservation wireless sensor networks scheduling sleep intervals extraneous nodes remaining nodes stay active provide continuous service. sensor network operate successfully active nodes must maintain sensing coverage network connectivity, proved communication range nodes least twice sensing range, complete coverage convex area implies connectivity among working set nodes. paper consider rectangular region = *b, r r b {\pounds}, {\pounds}, r sensing range nodes. put constraint minimum allowed distance nodes(s). according constraint present new lower bound communication range relative sensing range sensors(s 2 + 3 *r) complete coverage considered area implies connectivity among working set nodes; also present new distribution method, satisfy constraint.",4 "stochastic approximation canonical correlation analysis. propose novel first-order stochastic approximation algorithms canonical correlation analysis (cca).
our algorithms are presented as instances of inexact matrix stochastic gradient (msg) and inexact matrix exponentiated gradient (meg), and achieve $\epsilon$-suboptimality in the population objective in $\operatorname{poly}(\frac{1}{\epsilon})$ iterations. we also consider practical variants of the proposed algorithms and compare them with other methods for cca both theoretically and empirically.",4 "learning deep structured active contours end-to-end. the world is covered with millions of buildings, and precisely knowing each instance's position and extents is vital to a multitude of applications. recently, automated building footprint segmentation models have shown superior detection accuracy thanks to the usage of convolutional neural networks (cnn). however, even the latest evolutions struggle to precisely delineate borders, which often leads to geometric distortions and inadvertent fusion of adjacent building instances. we propose to overcome this issue by exploiting the distinct geometric properties of buildings. to this end, we present deep structured active contours (dsac), a novel framework that integrates priors and constraints into the segmentation process, such as continuous boundaries, smooth edges, and sharp corners. to do so, dsac employs active contour models (acm), a family of constraint- and prior-based polygonal models. we learn acm parameterizations per instance using a cnn, and show how to incorporate all components in a structured output model, making dsac trainable end-to-end. we evaluate dsac on three challenging building instance segmentation datasets, where it compares favorably against state-of-the-art. code will be made available.",4 "learning unbiased features. a key element in transfer learning is representation learning; if representations can be developed that expose the relevant factors underlying the data, then new tasks and domains can be learned readily based on mappings of these salient factors. we propose that an important aim for these representations is to be unbiased. different forms of representation learning can be derived from alternative definitions of unwanted bias, e.g., bias to particular tasks, domains, or irrelevant underlying data dimensions. one very useful approach to estimating the amount of bias in a representation comes from maximum mean discrepancy (mmd) [5], a measure of distance between probability distributions.
we first suggest that the mmd is a useful criterion for developing representations that apply across multiple domains or tasks [1]. however, in this paper we describe a number of novel applications of this criterion that we have devised, all based on the idea of developing unbiased representations. these formulations include: a standard domain adaptation framework; a method for learning invariant representations; an approach based on noise-insensitive autoencoders; and a novel form of generative model.",4 "a constraint logic programming approach for computing ordinal conditional functions. in order to give appropriate semantics to qualitative conditionals of the form ""if a then normally b"", ordinal conditional functions (ocfs) ranking the possible worlds according to their degree of plausibility can be used. an ocf accepting all conditionals of a knowledge base r can be characterized as the solution of a constraint satisfaction problem. we present a high-level, declarative approach using constraint logic programming techniques for solving this constraint satisfaction problem. in particular, the approach developed here supports the generation of all minimal solutions; these minimal solutions are of special interest as they provide a basis for model-based inference from r.",4 "memory shapes time perception and intertemporal choices. there is a consensus that human and non-human subjects experience temporal distortions in many stages of their perceptual and decision-making systems. similarly, intertemporal choice research has shown that decision-makers undervalue future outcomes relative to immediate ones. here we combine techniques from information theory and artificial intelligence to show how temporal distortions and intertemporal choice preferences can be explained as a consequence of the coding efficiency of sensorimotor representation. in particular, the model implies that interactions that constrain future behavior are perceived as being longer in duration and more valuable. furthermore, using simulations of artificial agents, we investigate how memory constraints enforce a renormalization of the perceived timescales. our results show that qualitatively different discount functions, such as exponential and hyperbolic discounting, arise as a consequence of an agent's probabilistic model of the world.",16 "joint multimodal learning with deep generative models.
we investigate deep generative models that can exchange multiple modalities bi-directionally, e.g., generating images from corresponding texts and vice versa. recently, some studies have handled multiple modalities with deep generative models, such as variational autoencoders (vaes). however, these models typically assume that modalities are forced into a conditioned relation, i.e., we can only generate modalities in one direction. to achieve our objective, we should extract a joint representation that captures high-level concepts among all modalities and through which we can exchange modalities bi-directionally. as described herein, we propose a joint multimodal variational autoencoder (jmvae), in which all modalities are independently conditioned on a joint representation. in other words, it models a joint distribution of all modalities. furthermore, to be able to generate missing modalities from the remaining modalities properly, we develop an additional method, jmvae-kl, that is trained by reducing the divergence between jmvae's encoder and prepared networks of respective modalities. our experiments show that our proposed method can obtain an appropriate joint representation from multiple modalities and that it can generate and reconstruct them more properly than conventional vaes. we further demonstrate that jmvae can generate multiple modalities bi-directionally.",19 "information-theoretic representation learning for positive-unlabeled classification. recent advances in weakly supervised classification allow us to train a classifier only from positive and unlabeled (pu) data. however, existing pu classification methods typically require an accurate estimate of the class-prior probability, which is a critical bottleneck particularly for high-dimensional data. this problem has been commonly addressed by applying principal component analysis in advance, but such unsupervised dimension reduction can collapse the underlying class structure. in this paper, we propose a novel representation learning method for pu data based on the information-maximization principle. our method does not require class-prior estimation and thus can be used as a preprocessing method for pu classification. through experiments, we demonstrate that our method, combined with deep neural networks, highly improves the accuracy of pu class-prior estimation, leading to state-of-the-art pu classification performance.",19 "a review of convolutional neural networks for inverse problems in imaging.
in this survey paper, we review recent uses of convolutional neural networks (cnns) to solve inverse problems in imaging. it has recently become feasible to train deep cnns on large databases of images, and they have shown outstanding performance on object classification and segmentation tasks. motivated by these successes, researchers have begun to apply cnns to the resolution of inverse problems such as denoising, deconvolution, super-resolution, and medical image reconstruction, and they have started to report improvements over state-of-the-art methods, including sparsity-based techniques such as compressed sensing. here, we review the recent experimental work in these areas, with a focus on the critical design decisions: where does the training data come from? what is the architecture of the cnn? and how is the learning problem formulated and solved? we also bring together key theoretical papers that offer perspective on why cnns are appropriate for inverse problems and point to some next steps in the field.",6 "sparse autoencoder for unsupervised nucleus detection and representation in histopathology images. histopathology images are crucial to the study of complex diseases such as cancer. the histologic characteristics of nuclei play a key role in disease diagnosis, prognosis and analysis. in this work, we propose a sparse convolutional autoencoder (cae) for fully unsupervised, simultaneous nucleus detection and feature extraction in histopathology tissue images. our cae detects and encodes nuclei in image patches in tissue images into sparse feature maps that encode both the location and appearance of nuclei. our cae is the first unsupervised detection network for computer vision applications. the pretrained nucleus detection and feature extraction modules in our cae can be fine-tuned for supervised learning in an end-to-end fashion. we evaluate our method on four datasets and reduce the errors of state-of-the-art methods by up to 42%. we are able to achieve comparable performance with only 5% of the fully-supervised annotation cost.",4 "survey on emotional body gesture recognition. automatic emotion recognition has become a trending research topic in the past decade. while works based on facial expressions or speech abound, recognizing affect from body gestures remains a less explored topic. we present a new comprehensive survey hoping to boost research in the field.
we first introduce emotional body gestures as a component of what is commonly known as ""body language"" and comment on general aspects such as gender differences and culture dependence. we then define a complete framework for automatic emotional body gesture recognition. we introduce person detection and comment on static and dynamic body pose estimation methods both in rgb and 3d. we then comment on recent literature related to representation learning and emotion recognition from images of emotionally expressive gestures. we also discuss multi-modal approaches that combine speech or face with body gestures for improved emotion recognition. while pre-processing methodologies (e.g. human detection and pose estimation) are nowadays mature technologies fully developed for robust large scale analysis, we show that for emotion recognition the quantity of labelled data is scarce, there is no agreement on clearly defined output spaces and the representations are shallow and largely based on naive geometrical representations.",4 "analysing fuzzy sets through combining measures of similarity and distance. reasoning with fuzzy sets can be achieved through measures of similarity and distance. however, these measures can often give misleading results when considered independently, for example giving the same value for two different pairs of fuzzy sets. this is particularly a problem where many fuzzy sets are generated from real data, and where two different measures may be used to automatically compare fuzzy sets, as it is difficult to interpret two different results. this is especially true where a large number of fuzzy sets are being compared as part of a reasoning system. this paper introduces a method for combining the results of multiple measures into a single measure for the purpose of analysing and comparing fuzzy sets. the combined measure alleviates ambiguous results and aids in the automatic comparison of fuzzy sets. the properties of the combined measure are given, and demonstrations are presented with discussions on the advantages of using it over a single measure.",4 "variational reasoning for question answering with knowledge graph. a knowledge graph (kg) is known to be helpful for the task of question answering (qa), since it provides well-structured relational information between entities, and allows one to further infer indirect facts. however, it is challenging to build qa systems which can learn to reason over knowledge graphs based on question-answer pairs alone.
first, when people ask questions, their expressions are noisy (for example, typos in texts, or variations in pronunciations), which is non-trivial for the qa system to match against the entities mentioned in the knowledge graph. second, many questions require multi-hop logic reasoning over the knowledge graph to retrieve the answers. to address these challenges, we propose a novel and unified deep learning architecture, and an end-to-end variational learning algorithm which can handle noise in questions and learn multi-hop reasoning simultaneously. our method achieves state-of-the-art performance on a recent benchmark dataset in the literature. we also derive a series of new benchmark datasets, including questions for multi-hop reasoning, questions paraphrased by a neural translation model, and questions in human voice. our method yields promising results on all these challenging datasets.",4 "seeing behind the camera: identifying the authorship of a photograph. we introduce the novel problem of identifying the photographer behind a photograph. to explore the feasibility of current computer vision techniques to address this problem, we created a new dataset of over 180,000 images taken by 41 well-known photographers. using this dataset, we examined the effectiveness of a variety of features (low and high-level, including cnn features) in identifying the photographer. we also trained a new deep convolutional neural network for this task. our results show that high-level features greatly outperform low-level features. we provide qualitative results using these learned models that give insight into our method's ability to distinguish between photographers, and allow us to draw interesting conclusions about what specific photographers shoot. we also demonstrate two applications of our method.",4 "grammatical case based is-a relation extraction with boosting for polish. pattern-based methods of is-a relation extraction rely heavily on so called hearst patterns. these are ways of expressing instance enumerations of a class in natural language. while these lexico-syntactic patterns prove quite useful, they may not capture all taxonomical relations expressed in text. therefore in this paper we describe a novel method of is-a relation extraction from patterns, which uses morpho-syntactical annotations along with the grammatical case of the noun phrases that constitute the entities participating in the is-a relation.
we also describe a method for increasing the number of extracted relations, which we call pseudo-subclass boosting, with potential application in any pattern-based relation extraction method. experiments were conducted on a corpus of about 0.5 billion web documents in the polish language.",4 "deep long short-term memory adaptive beamforming networks for multichannel robust speech recognition. far-field speech recognition in noisy and reverberant conditions remains a challenging problem despite recent deep learning breakthroughs. this problem is commonly addressed by acquiring a speech signal from multiple microphones and performing beamforming over them. in this paper, we propose to use a recurrent neural network with long short-term memory (lstm) architecture to adaptively estimate real-time beamforming filter coefficients to cope with non-stationary environmental noise and the dynamic nature of source and microphone positions, which results in a set of time-varying room impulse responses. the lstm adaptive beamformer is jointly trained with a deep lstm acoustic model to predict senone labels. further, we use hidden units in the deep lstm acoustic model to assist in predicting the beamforming filter coefficients. the proposed system achieves a 7.97% absolute gain over baseline systems with no beamforming on the chime-3 real evaluation set.",6 "depth separation in neural networks. let $f:\mathbb{S}^{d-1}\times \mathbb{S}^{d-1}\to\mathbb{R}$ be a function of the form $f(\mathbf{x},\mathbf{x}') = g(\langle\mathbf{x},\mathbf{x}'\rangle)$ for $g:[-1,1]\to \mathbb{R}$. we give a simple proof that shows that poly-size depth two neural networks with (exponentially) bounded weights cannot approximate $f$ whenever $g$ cannot be approximated by a low degree polynomial. moreover, for many $g$'s, such as $g(x)=\sin(\pi d^3x)$, the number of neurons must be $2^{\Omega\left(d\log(d)\right)}$. furthermore, the result holds w.r.t.\ the uniform distribution on $\mathbb{S}^{d-1}\times \mathbb{S}^{d-1}$. as many functions of the above form can be well approximated by poly-size depth three networks with poly-bounded weights, this establishes a separation between depth two and depth three networks w.r.t.\ the uniform distribution on $\mathbb{S}^{d-1}\times \mathbb{S}^{d-1}$.",4 "scale up nonlinear component analysis with doubly stochastic gradients.
nonlinear component analysis methods such as kernel principal component analysis (kpca) and kernel canonical correlation analysis (kcca) are widely used in machine learning, statistics and data analysis, but they can not scale up to big datasets. recent attempts have employed random feature approximations to convert the problem to the primal form for linear computational complexity. however, to obtain high quality solutions, the number of random features should be the same order of magnitude as the number of data points, making such an approach not directly applicable to the regime with millions of data points. we propose a simple, computationally efficient, and memory friendly algorithm based on ""doubly stochastic gradients"" to scale up a range of kernel nonlinear component analysis methods, such as kernel pca, cca and svd. despite the \emph{non-convex} nature of these problems, our method enjoys theoretical guarantees that it converges at the rate $\tilde{O}(1/t)$ to the global optimum, even for the top $k$ eigen subspace. unlike many alternatives, our algorithm does not require explicit orthogonalization, which is infeasible on big datasets. we demonstrate the effectiveness and scalability of our algorithm on large scale synthetic and real world datasets.",4 "semantic segmentation from limited training data. we present our approach for robotic perception in cluttered scenes that led to winning the recent amazon robotics challenge (arc) 2017. next to small objects with shiny and transparent surfaces, the biggest challenge of the 2017 competition was the introduction of unseen categories. in contrast to traditional approaches which require large collections of annotated data and many hours of training, the task here was to obtain a robust perception pipeline with only few minutes of data acquisition and training time. to this end, we present two strategies that we explored. one is a deep metric learning approach that works in three separate steps: semantic-agnostic boundary detection, patch classification and pixel-wise voting. the other is a fully-supervised semantic segmentation approach with efficient dataset collection. we conduct an extensive analysis of the two methods on our arc 2017 dataset. interestingly, only a few examples of each class are sufficient to fine-tune even very deep convolutional neural networks for this specific task.",4 "fast threshold tests for detecting discrimination.
threshold tests have recently been proposed as a useful method for detecting bias in lending, hiring, and policing decisions. for example, in the case of credit extensions, these tests aim to estimate the bar for granting loans to white and minority applicants, with a higher inferred threshold for minorities indicative of discrimination. this technique, however, requires fitting a complex bayesian latent variable model for which inference is often computationally challenging. here we develop a method for fitting threshold tests that is two orders of magnitude faster than the existing approach, reducing computation from hours to minutes. to achieve these performance gains, we introduce and analyze a flexible family of probability distributions on the interval [0, 1] -- which we call discriminant distributions -- that is computationally efficient to work with. we demonstrate our technique by analyzing 2.7 million police stops of pedestrians in new york city.",19 "fast optical flow using dense inverse search. most recent works in optical flow extraction focus on accuracy and neglect time complexity. however, in real-life visual applications, such as tracking, activity detection and recognition, time complexity is critical. we propose a solution with very low time complexity and competitive accuracy for the computation of dense optical flow. it consists of three parts: 1) inverse search for patch correspondences; 2) dense displacement field creation through patch aggregation along multiple scales; 3) variational refinement. at the core of our dense inverse search-based method (dis) is the efficient search of correspondences inspired by the inverse compositional image alignment proposed by baker and matthews in 2001. dis is competitive on standard optical flow benchmarks with large displacements. dis runs at 300hz up to 600hz on a single cpu core, reaching the temporal resolution of the human biological vision system. it is order(s) of magnitude faster than state-of-the-art methods in the same range of accuracy, making dis ideal for visual applications.",4 "a unifying probabilistic perspective for spectral dimensionality reduction: insights and new models. we introduce a new perspective on spectral dimensionality reduction which views these methods as gaussian markov random fields (grfs). our unifying perspective is based on the maximum entropy principle, which is in turn inspired by maximum variance unfolding.
the resulting model, which we call maximum entropy unfolding (meu), is a nonlinear generalization of principal component analysis. we relate the model to laplacian eigenmaps and isomap. we show that parameter fitting in locally linear embedding (lle) is approximate maximum likelihood in the meu model. we introduce a variant of lle that performs maximum likelihood exactly: acyclic lle (alle). we show that meu and alle are competitive with the leading spectral approaches on a robot navigation visualization and a human motion capture data set. finally the maximum likelihood perspective allows us to introduce a new approach to dimensionality reduction based on l1 regularization of the gaussian random field via the graphical lasso.",4 "understanding symmetries in deep networks. recent works have highlighted the scale invariance or symmetry present in the weight space of a typical deep network and the adverse effect it has on euclidean gradient based stochastic gradient descent optimization. in this work, we show that a commonly used deep network, which uses a convolution, batch normalization, relu, max-pooling, and sub-sampling pipeline, possesses more complex forms of symmetry arising from scaling-based reparameterization of the network weights. we propose to tackle the issue of weight space symmetry by constraining the filters to lie on the unit-norm manifold. consequently, training the network boils down to using stochastic gradient descent updates on the unit-norm manifold. our empirical evidence based on the mnist dataset shows that the proposed updates improve the test performance beyond what is achieved with batch normalization, and without sacrificing the computational efficiency of the weight updates.",4 "hubs of languages: scale free networks of synonyms. natural languages are described in this paper in terms of networks of synonyms: a word is identified with a node, and synonyms are connected by undirected links. statistical analysis of the network of synonyms of the polish language showed that it is scale-free, similar to what is known for english. the statistical properties of both networks are also similar. thus, the statistical aspects of such networks are good candidates for culture independent elements of human language. we hypothesize that optimization for robustness and efficiency is responsible for this universality. despite the statistical similarity, there is no one-to-one mapping between the networks of the two languages.
although many hubs of the polish network are translated similarly into highly connected hubs in english, there are also hubs specific to one of these languages only: a single word in one language is equivalent to many different and disconnected words in the other, in accordance with the whorf hypothesis of language relativity. identifying such language-specific hubs is vitally important for automatic translation, and for understanding contextual, culturally related messages that are frequently missed or twisted in a naive, literary translation.",15 "fast marginalized block sparse bayesian learning algorithm. the performance of sparse signal recovery from noise corrupted, underdetermined measurements can be improved if both sparsity and correlation structure of signals are exploited. one typical correlation structure is the intra-block correlation in block sparse signals. to exploit this structure, a framework, called block sparse bayesian learning (bsbl), has been proposed recently. algorithms derived from this framework showed superior performance but they are not very fast, which limits their applications. this work derives an efficient algorithm from this framework, using a marginalized likelihood maximization method. compared to existing bsbl algorithms, it has close recovery performance but is much faster. therefore, it is more suitable for large scale datasets and applications requiring real-time implementation.",4 "lightlda: big topic models on modest compute clusters. when building large-scale machine learning (ml) programs, such as big topic models or deep neural nets, one usually assumes that such tasks can only be attempted with industrial-sized clusters with thousands of nodes, which are out of reach for most practitioners or academic researchers. we consider this challenge in the context of topic modeling on web-scale corpora, and show that with a modest cluster of as few as 8 machines, we can train a topic model with 1 million topics and a 1-million-word vocabulary (for a total of 1 trillion parameters), on a document collection with 200 billion tokens -- a scale not yet reported even with thousands of machines.
our major contributions include: 1) a new, highly efficient o(1) metropolis-hastings sampling algorithm, whose running cost is (surprisingly) agnostic of model size, and empirically converges nearly an order of magnitude faster than current state-of-the-art gibbs samplers; 2) a structure-aware model-parallel scheme, which leverages dependencies within the topic model, yielding a sampling strategy that is frugal on machine memory and network communication; 3) a differential data-structure for model storage, which uses separate data structures for high- and low-frequency words to allow extremely large models to fit in memory, while maintaining high inference speed; and 4) a bounded asynchronous data-parallel scheme, which allows efficient distributed processing of massive data via a parameter server. our distribution strategy is an instance of the model-and-data-parallel programming model underlying the petuum framework for general distributed ml, and was implemented on top of the petuum open-source system. we provide experimental evidence showing how this development puts massive models within reach on a small cluster while still enjoying proportional time cost reductions with increasing cluster size, in comparison with alternative options.",19 "practical inventory routing: a problem definition and an optimization method. the global objective of this work is to provide practical optimization methods to companies involved in inventory routing problems, taking into account this new type of data. also, companies are sometimes not able to deal with changing plans every period and would like to adopt regular structures for serving customers.",4 "salt-n-pepper noise filtering using cellular automata. cellular automata (ca) have been considered one of the most pronounced parallel computational tools in the recent era of nature and bio-inspired computing. taking advantage of their local connectivity, the simplicity of their design and their inherent parallelism, ca can be effectively applied to many image processing tasks. in this paper, a ca approach for efficient salt-n-pepper noise filtering of grayscale images is presented. using a 2d moore neighborhood, the classified ""noisy"" cells are corrected by averaging the non-noisy neighboring cells.
while keeping the computational burden really low, the proposed approach succeeds in removing high-noise levels from various images and yields promising qualitative and quantitative results, compared to state-of-the-art techniques.",4 "semi-supervised active learning with support vector machines: a novel approach that exploits structure information in data. in today's information society more and more data emerges, e.g.~in social networks, technical applications, or business applications. companies try to commercialize these data using data mining or machine learning methods. for this purpose, the data are categorized or classified, but often at high (monetary or temporal) costs. an effective approach to reduce these costs is to apply some kind of active learning (al) method, as al controls the training process of a classifier by the specific querying of individual data points (samples), which are then labeled (e.g., provided with class memberships) by a domain expert. however, an analysis of current al research shows that al still has some shortcomings. in particular, the structure information given by the spatial pattern of the (un)labeled data in the input space of a classification model (e.g.,~cluster information), is used in an insufficient way. in addition, many existing al techniques pay too little attention to their practical applicability. to meet these challenges, this article presents several techniques that together build a new approach for combining al and semi-supervised learning (ssl) with support vector machines (svm) for classification tasks. structure information is captured by means of probabilistic models that are iteratively improved at runtime when label information becomes available. the probabilistic models are considered in a selection strategy based on distance, density, diversity, and distribution (4ds strategy) information for al and in a kernel function (responsibility weighted mahalanobis kernel) for svm. the approach fuses generative and discriminative modeling techniques. with 20 benchmark data sets and with the mnist data set it is shown that our new solution yields significantly better results than state-of-the-art methods.",19 "crosscat: a fully bayesian nonparametric method for analyzing heterogeneous, high dimensional data.
there is a widespread need for statistical methods that can analyze high-dimensional datasets without imposing restrictive or opaque modeling assumptions. this paper describes a domain-general data analysis method called crosscat. crosscat infers multiple non-overlapping views of the data, each consisting of a subset of the variables, and uses a separate nonparametric mixture model for each view. crosscat is based on approximately bayesian inference in a hierarchical, nonparametric model for data tables. this model consists of a dirichlet process mixture over the columns of a data table in which each mixture component is itself an independent dirichlet process mixture over the rows; the inner mixture components are simple parametric models whose form depends on the types of data in the table. crosscat combines strengths of mixture modeling and bayesian network structure learning. like mixture modeling, crosscat can model a broad class of distributions by positing latent variables, and produces representations that can be efficiently conditioned and sampled from for prediction. like bayesian networks, crosscat represents the dependencies and independencies between variables, and thus remains accurate when there are multiple statistical signals. inference is done via a scalable gibbs sampling scheme; this paper shows that it works well in practice. this paper also includes empirical results on heterogeneous tabular data of up to 10 million cells, such as hospital cost and quality measures, voting records, unemployment rates, gene expression measurements, and images of handwritten digits. crosscat infers structure that is consistent with accepted findings and common-sense knowledge in multiple domains, and yields predictive accuracy competitive with generative, discriminative, and model-free alternatives.",4 "active learning from imperfect labelers. we study active learning where the labeler can not only return incorrect labels but also abstain from labeling. we consider different noise and abstention conditions of the labeler. we propose an algorithm which utilizes abstention responses, and analyze its statistical consistency and query complexity under fairly natural assumptions on the noise and abstention rate of the labeler. this algorithm is adaptive in the sense that it can automatically request fewer queries with a more informed or less noisy labeler.
we couple our algorithm with lower bounds to show that under some technical conditions, it achieves nearly optimal query complexity.",4 "on the inconsistency of $\ell_1$-penalised sparse precision matrix estimation. various $\ell_1$-penalised estimation methods such as graphical lasso and clime are widely used for sparse precision matrix estimation. many of these methods have been shown to be consistent under various quantitative assumptions about the underlying true covariance matrix. intuitively, these conditions are related to situations where the penalty term will dominate the optimisation. in this paper, we explore the consistency of $\ell_1$-based methods for a class of sparse latent variable -like models, which are strongly motivated by several types of applications. we show that all $\ell_1$-based methods fail dramatically for models with nearly linear dependencies between the variables. we also study the consistency on models derived from real gene expression data and note that the assumptions needed for consistency never hold even for modest sized gene networks, and $\ell_1$-based methods also become unreliable in practice for larger networks.",4 "neutrosophic description logic. description logics (dls) are appropriate, widely used, logics for managing structured knowledge. they allow reasoning about individuals and concepts, i.e. sets of individuals with common properties. typically, dls are limited to dealing with crisp, well defined concepts. that is, concepts for which the problem of whether an individual is an instance of it is a yes/no question. more often than not, the concepts encountered in the real world do not have precisely defined criteria of membership: we may say that an individual is an instance of a concept only to a certain degree, depending on the individual's properties. the dls that deal with such fuzzy concepts are called fuzzy dls. in order to deal with fuzzy, incomplete, indeterminate and inconsistent concepts, we need to extend the fuzzy dls, combining neutrosophic logic with a classical dl. in particular, concepts become neutrosophic (here neutrosophic means fuzzy, incomplete, indeterminate, and inconsistent), and thus reasoning about neutrosophic concepts is supported. we'll define its syntax, its semantics, and describe its properties.",4 "harnessing natural fluctuations: an analogue computer for efficient socially maximal decision making. each individual handles many tasks of finding the most profitable option from a set of options that stochastically provide rewards.
our society comprises a collection of such individuals, and the society is expected to maximise the total rewards, while the individuals compete for common rewards. such collective decision making is formulated as the `competitive multi-armed bandit problem (cbp)', requiring a huge computational cost. herein, we demonstrate a prototype of an analog computer that efficiently solves cbps by exploiting the physical dynamics of numerous fluids in coupled cylinders. this device enables the maximisation of the total rewards for the society without paying the conventionally required computational cost; this is because the fluids estimate the reward probabilities of the options for the exploitation of past knowledge and generate random fluctuations for the exploration of new knowledge. our results suggest that to optimise the social rewards, the utilisation of fluid-derived natural fluctuations is more advantageous than applying artificial external fluctuations. our analog computing scheme is expected to trigger further studies for harnessing the huge computational power of natural phenomena for resolving a wide variety of complex problems in modern information society.",4 "tighter variational representations of f-divergences via restriction to probability measures. we show that the variational representations of f-divergences currently used in the literature can be tightened. this has implications for a number of methods recently proposed based on this representation. as an example application we use our tighter representation to derive a general f-divergence estimator based on two i.i.d. samples and derive the dual program for this estimator that performs well empirically. we also point out a connection between our estimator and mmd.,4 "fusionnet: a deep fully residual convolutional neural network for image segmentation in connectomics. electron microscopic connectomics is an ambitious research direction with the goal of studying comprehensive brain connectivity maps by using high-throughput, nano-scale microscopy. one of the main challenges in connectomics research is developing scalable image analysis algorithms that require minimal user intervention. recently, deep learning has drawn much attention in computer vision because of its exceptional performance in image classification tasks. for this reason, its application to connectomic analyses holds great promise, as well.
in this paper, we introduce a novel deep neural network architecture, fusionnet, for the automatic segmentation of neuronal structures in connectomics data. fusionnet leverages the latest advances in machine learning, such as semantic segmentation and residual neural networks, with the novel introduction of summation-based skip connections to allow a much deeper network architecture for a more accurate segmentation. we demonstrate the performance of the proposed method by comparing it with state-of-the-art electron microscopy (em) segmentation methods from the isbi em segmentation challenge. we also show the segmentation results from two different tasks, including cell membrane and cell body segmentation, and a statistical analysis of cell morphology.",4 "aggregated wasserstein metric and state registration for hidden markov models. we propose a framework, named aggregated wasserstein, for computing a dissimilarity measure or distance between two hidden markov models with state conditional distributions being gaussian. for such hmms, the marginal distribution at any time position follows a gaussian mixture distribution, a fact exploited to softly match, aka register, the states in two hmms. we refer to such hmms as gaussian mixture model-hmms (gmm-hmms). the registration of states is inspired by the intrinsic relationship between optimal transport and the wasserstein metric between distributions. specifically, the components of the marginal gmms are matched by solving an optimal transport problem where the cost between components is the wasserstein metric for gaussian distributions. the solution of the optimization problem is a fast approximation to the wasserstein metric between two gmms. the new aggregated wasserstein distance is a semi-metric and can be computed without generating monte carlo samples. it is invariant to relabeling or permutation of states. the distance is defined meaningfully even for two hmms that are estimated from data of different dimensionality, a situation that can arise due to missing variables. this distance quantifies the dissimilarity of gmm-hmms by measuring both the difference between the two marginal gmms and that between the two transition matrices. our new distance is tested on tasks of retrieval, classification, and t-sne visualization of time series.
Experiments on synthetic and real data demonstrated its advantages in terms of accuracy as well as efficiency in comparison with existing distances based on the Kullback-Leibler divergence.",4 "Accelerated mini-batch stochastic dual coordinate ascent. Stochastic dual coordinate ascent (SDCA) is an effective technique for solving regularized loss minimization problems in machine learning. This paper considers an extension of SDCA under the mini-batch setting that is often used in practice. Our main contribution is to introduce an accelerated mini-batch version of SDCA and prove a fast convergence rate for this method. We discuss an implementation of the method for a parallel computing system, and compare the results to both the vanilla stochastic dual coordinate ascent and the accelerated deterministic gradient descent method of \cite{nesterov2007gradient}.",19 "State transition algorithm. In terms of the concepts of state and state transition, a new heuristic random search algorithm named state transition algorithm is proposed. For continuous function optimization problems, four special transformation operators called rotation, translation, expansion and axesion are designed. Adjusting measures of the transformations are mainly studied to keep the balance of exploration and exploitation. Convergence analysis is also discussed for the algorithm based on random search theory. Meanwhile, to strengthen the search ability in high dimensional space, a communication strategy is introduced into the basic algorithm and intermittent exchange is presented to prevent premature convergence. Finally, experiments are carried out for the algorithms. With 10 common benchmark unconstrained continuous functions used to test the performance, the results show that state transition algorithms are promising algorithms due to their good global search capability and convergence property when compared with some popular algorithms.",12 "Coping with construals in broad-coverage semantic annotation of adpositions. We consider the semantics of prepositions, revisiting a broad-coverage annotation scheme used for annotating all 4,250 preposition tokens in a 55,000 word corpus of English.
Attempts to apply the scheme to adpositions and case markers in other languages, as well as to problematic cases in English, have led us to reconsider the assumption that a preposition's lexical contribution is equivalent to the role/relation that it mediates. Our proposal is to embrace the potential for construal in adposition use, expressing such phenomena directly at the token level to manage complexity and avoid sense proliferation. We suggest a framework to represent both the scene role and the adposition's lexical function so they can be annotated at scale---supporting automatic, statistical processing of domain-general language---and sketch how this representation would inform a constructional analysis.",4 "Neural end-to-end learning for computational argumentation mining. We investigate neural techniques for end-to-end computational argumentation mining (AM). We frame AM both as a token-based dependency parsing problem and as a token-based sequence tagging problem, including a multi-task learning setup. Contrary to models that operate on the argument component level, we find that framing AM as dependency parsing leads to subpar performance results. In contrast, less complex (local) tagging models based on BiLSTMs perform robustly across classification scenarios, being able to catch long-range dependencies inherent to the problem. Moreover, we find that jointly learning 'natural' subtasks, in a multi-task learning setup, improves performance.",4 "Community identity and user engagement in a multi-community landscape. A community's identity defines and shapes its internal dynamics. Our current understanding of this interplay is mostly limited to glimpses gathered from isolated studies of individual communities. In this work we provide a systematic exploration of the nature of this relation across a wide variety of online communities. To this end we introduce a quantitative, language-based typology reflecting two key aspects of a community's identity: how distinctive, and how temporally dynamic it is. By mapping almost 300 Reddit communities into the landscape induced by this typology, we reveal regularities in how patterns of user engagement vary with the characteristics of a community. Our results suggest that the way new and existing users engage with a community depends strongly and systematically on the nature of the collective identity it fosters, in ways that are highly consequential to community maintainers.
For example, communities with distinctive and highly dynamic identities are more likely to retain their users. However, such niche communities also exhibit much larger acculturation gaps between existing users and newcomers, which potentially hinder the integration of the latter. More generally, our methodology reveals differences in how various social phenomena manifest across communities, and shows that structuring the multi-community landscape can lead to a better understanding of the systematic nature of this diversity.",4 "Learning interpretable musical compositional rules and traces. Throughout music history, theorists have identified and documented interpretable rules that capture the decisions of composers. This paper asks, ""can a machine behave like a music theorist?"" It presents MUS-ROVER, a self-learning system for automatically discovering rules from symbolic music. MUS-ROVER performs feature learning via $n$-gram models to extract compositional rules --- statistical patterns over the resulting features. We evaluate MUS-ROVER on Bach's (SATB) chorales, demonstrating that it can recover known rules, as well as identify new, characteristic patterns for further study. We discuss how the extracted rules can be used in both machine and human composition.",19 "Extended depth-of-field in holographic image reconstruction using deep learning based auto-focusing and phase-recovery. Holography encodes the three dimensional (3D) information of a sample in the form of an intensity-only recording. However, to decode the original sample image from its hologram(s), auto-focusing and phase-recovery are needed, which are in general cumbersome and time-consuming to digitally perform. Here we demonstrate a convolutional neural network (CNN) based approach that simultaneously performs auto-focusing and phase-recovery to significantly extend the depth-of-field (DOF) in holographic image reconstruction. For this, a CNN is trained by using pairs of randomly de-focused back-propagated holograms and corresponding in-focus phase-recovered images. After this training phase, the CNN takes a single back-propagated hologram of a 3D sample as input to rapidly achieve phase-recovery and reconstruct an in-focus image of the sample over a significantly extended DOF.
This deep learning based DOF extension method is non-iterative, and significantly improves the algorithm time-complexity of holographic image reconstruction from O(nm) to O(1), where n refers to the number of individual object points or particles within the sample volume, and m represents the focusing search space within which each object point or particle needs to be individually focused. These results highlight some of the unique opportunities created by data-enabled statistical image reconstruction methods powered by machine learning, and we believe that the presented approach can be broadly applicable to computationally extend the DOF of other imaging modalities.",4 "Joint dictionaries for zero-shot learning. A classic approach toward zero-shot learning (ZSL) is to map the input domain to a set of semantically meaningful attributes that could be used later to classify unseen classes of data (e.g. visual data). In this paper, we propose to learn a visual feature dictionary that has semantically meaningful atoms. Such a dictionary is learned via joint dictionary learning for the visual domain and the attribute domain, while enforcing the same sparse coding for both dictionaries. Our novel attribute aware formulation provides an algorithmic solution to the domain shift/hubness problem in ZSL. Upon learning the joint dictionaries, images from unseen classes can be mapped into the attribute space by finding the attribute aware joint sparse representation using solely the visual data. We demonstrate that our approach provides superior or comparable performance to that of the state of the art on benchmark datasets.",4 "Adversarial attacks beyond the image space. Generating adversarial examples is an intriguing problem and an important way of understanding the working mechanism of deep neural networks. Recently, it has attracted a lot of attention in the computer vision community. Most existing approaches generated perturbations in the image space, i.e., each pixel can be modified independently. However, it remains unclear whether these adversarial examples are authentic, in the sense that they correspond to actual changes in physical properties. This paper aims at exploring this topic in the contexts of object classification and visual question answering. The baselines are set to be several state-of-the-art deep neural networks which receive 2D input images.
We augment these networks with a differentiable 3D rendering layer in front, so that a 3D scene (in the physical space) is rendered into a 2D image (in the image space), and then mapped to a prediction (in the output space). There are two (direct and indirect) ways of attacking the physical parameters. The former back-propagates the gradients of error signals from the output space to the physical space directly, while the latter first constructs an adversary in the image space, and then attempts to find the best solution in the physical space for the rendered image. An important finding is that attacking the physical space is much more difficult, as the direct method, compared with that used in the image space, produces a much lower success rate and requires heavier perturbations to be added. On the other hand, when the indirect method does work out, it suggests that the adversaries generated in the image space are inauthentic. By interpreting them in the physical space, such adversaries can be filtered out, showing promise in defending against adversaries.",4 "Clustering belief functions based on attracting and conflicting metalevel evidence. In this paper we develop a method for clustering belief functions based on attracting and conflicting metalevel evidence. Such clustering is done when the belief functions concern multiple events, and all the belief functions are mixed up. The clustering process is used as a means of separating the belief functions into subsets that should be handled independently. While the conflicting metalevel evidence is generated internally from pairwise conflicts between the belief functions, the attracting metalevel evidence is assumed to be given by an external source.",4 "Computerized tomography with total variation and with shearlets. To reduce the x-ray dose in computerized tomography (CT), many constrained optimization approaches have been proposed aiming at minimizing a regularizing function that measures a lack of consistency with some prior knowledge about the object being imaged, subject to a (predetermined) level of consistency with the detected attenuation of x-rays. Proponents of the shearlet transform as the regularizing function claim that the reconstructions so obtained are better than those produced by using TV for texture preservation (but may be worse for noise reduction). In this paper we report results related to this claim. In our reported experiments using simulated CT data collection of the head, reconstructions whose shearlet transform has a small $\ell_1$-norm are not more efficacious than reconstructions that have a small TV value.
Our experiments for making such comparisons use the recently-developed superiorization methodology for both regularizing functions. Superiorization is an automated procedure for turning an iterative algorithm for producing images that satisfy a primary criterion (such as consistency with the observed measurements) into its superiorized version that will produce results that, according to the primary criterion, are as good as those produced by the original algorithm, but in addition are superior according to a secondary (regularizing) criterion. The method presented for superiorization involving the $\ell_1$-norm of the shearlet transform is novel and quite general: it can be used for any regularizing function that is defined as the $\ell_1$-norm of a transform specified by the application of a matrix. Since in the previous literature the split Bregman algorithm has been used for similar purposes, a section is included comparing the results of the superiorization algorithm with those of the split Bregman algorithm.",15 "Designing deep convolutional neural networks for continuous object orientation estimation. Deep convolutional neural networks (DCNNs) have proven effective for various computer vision problems. In this work, we demonstrate their effectiveness on a continuous object orientation estimation task, which requires the prediction of 0 to 360 degrees orientation of objects. We do so by proposing and comparing three continuous orientation prediction approaches designed for DCNNs. The first two approaches work by representing an orientation as a point on a unit circle and minimizing either the L2 loss or the angular difference loss. The third method works by first converting the continuous orientation estimation task into a set of discrete orientation estimation tasks and then converting the discrete orientation outputs back to a continuous orientation using a mean-shift algorithm. By evaluating on a vehicle orientation estimation task and a pedestrian orientation estimation task, we demonstrate that the discretization-based approach not only works better than the other two approaches but also achieves state-of-the-art performance. We also demonstrate that finding an appropriate feature representation is critical to achieve good performance when adapting a DCNN trained for an image recognition task.",4 "Exploiting spatio-temporal structure with recurrent winner-take-all networks.
We propose a convolutional recurrent neural network, with winner-take-all dropout, for high dimensional unsupervised feature learning in multi-dimensional time series. We apply the proposed method to object recognition with temporal context in videos and obtain better results than comparable methods in the literature, including the deep predictive coding networks previously proposed by Chalasani and Principe. Our contributions can be summarized as a scalable reinterpretation of deep predictive coding networks trained end-to-end with backpropagation through time, an extension of the previously proposed winner-take-all autoencoders to sequences in time, and a new technique for initializing and regularizing convolutional-recurrent neural networks.",4 "Deep contextualized word representations. We introduce a new type of deep contextualized word representation that models both (1) complex characteristics of word use (e.g., syntax and semantics), and (2) how these uses vary across linguistic contexts (i.e., to model polysemy). Our word vectors are learned functions of the internal states of a deep bidirectional language model (biLM), which is pre-trained on a large text corpus. We show that these representations can be easily added to existing models and significantly improve the state of the art across six challenging NLP problems, including question answering, textual entailment and sentiment analysis. We also present an analysis showing that exposing the deep internals of the pre-trained network is crucial, allowing downstream models to mix different types of semi-supervision signals.",4 "Segmentation of breast regions in mammograms based on density: a review. The focus of this paper is to review approaches for the segmentation of breast regions in mammograms according to breast density. Studies based on density have been undertaken because of the relationship between breast cancer and density. Breast cancer usually occurs in the fibroglandular area of breast tissue, which appears bright on mammograms and is described as breast density. Most of the studies have focused on classification methods for glandular tissue detection. Others have highlighted segmentation methods for fibroglandular tissue, while some researchers have performed segmentation of the breast anatomical regions based on density.
There are also works on the segmentation of specific parts of the breast regions, either for the detection of the nipple position, the skin-air interface or the pectoral muscles. The problems of evaluating the performance of segmentation results in relation to ground truth are also discussed in this paper.",4 "A critical reassessment of evolutionary algorithms for the cryptanalysis of the simplified data encryption standard algorithm. In this paper we analyze the cryptanalysis of the simplified data encryption standard algorithm using meta-heuristics and in particular genetic algorithms. The classic fitness function when using such an algorithm is to compare n-gram statistics of a decrypted message with those of the target message. We show that using such a function is irrelevant in the case of a genetic algorithm, because there is simply no correlation between the distance to the real key (the optimum) and the value of the fitness; in other words, there is no hidden gradient. In order to emphasize this assumption we experimentally show that a genetic algorithm performs worse than a random search for the cryptanalysis of the simplified data encryption standard algorithm.",4 "The community structure of industrial SAT instances. Modern SAT solvers have experienced a remarkable progress on solving industrial instances. Most of these techniques have been developed after an intensive experimental process. It is believed that these techniques exploit the underlying structure of industrial instances. However, there have been few works trying to exactly characterize the main features of this structure. The research community on complex networks has developed techniques of analysis and algorithms to study real-world graphs that can be used by the SAT community. Recently, there have been some attempts to analyze the structure of industrial SAT instances in terms of complex networks, with the aim of explaining the success of SAT solving techniques, and possibly improving them. In this paper, inspired by the results on complex networks, we study the community structure, or modularity, of industrial SAT instances. In a graph with a clear community structure, i.e. a high modularity, we can find a partition of its nodes into communities such that most edges connect variables of the same community. In our analysis, we represent SAT instances as graphs, and we show that most application benchmarks are characterized by a high modularity. On the contrary, random SAT instances are closer to the classical Erd\""os-R\'enyi random graph model, where no structure can be observed. We also analyze how this structure evolves by the effects of the execution of a SAT solver.
In particular, we detect that the new clauses learnt by the solver during the search contribute to destroy the original community structure of the formula. This partially explains the distinct performance of SAT solvers on random and industrial SAT instances.",4 "A unified method for first and third person action recognition. In this paper, a new video classification methodology is proposed which can be applied to both first and third person videos. The main idea behind the proposed strategy is to capture complementary information of appearance and motion efficiently by performing two independent streams on the videos. The first stream is aimed to capture long-term motions from shorter ones by keeping track of how elements of optical flow images have changed over time. Optical flow images are described by pre-trained networks that have been trained on large scale image datasets. A set of multi-channel time series is obtained by aligning descriptions beside each other. For extracting motion features from these time series, the PoT representation method plus a novel pooling operator is followed due to its several advantages. The second stream is accomplished to extract appearance features, which are vital in the case of video classification. The proposed method has been evaluated on both first and third-person datasets and the results show that the proposed methodology successfully reaches the state of the art.",4 "Vision-based human gender recognition: a survey. Gender is an important demographic attribute of people. This paper provides a survey of human gender recognition in computer vision. A review of approaches exploiting information from the face and the whole body (either from a still image or a gait sequence) is presented. We highlight the challenges faced and survey the representative methods of these approaches. Based on the results, good performance has been achieved for datasets captured under controlled environments, but there is still much work to be done to improve the robustness of gender recognition under real-life environments.",4 "Light field stitching for an extended synthetic aperture. By capturing the spatial and angular radiance distribution, light field cameras introduce new capabilities that are not possible with conventional cameras. So far in the light field imaging literature, the focus has been on the theory and applications of single light field capture.
By combining multiple light fields, it is possible to obtain new capabilities and enhancements, and even exceed physical limitations, such as the spatial resolution and aperture size of the imaging device. In this paper, we present an algorithm to register and stitch multiple light fields. We utilize the regularity of the spatial and angular sampling in light field data, and extend techniques developed for stereo vision systems to light field data. Such an extension is not straightforward for a micro-lens array (MLA) based light field camera due to the extremely small baseline and low spatial resolution. By merging multiple light fields captured with an MLA based camera, we obtain a larger synthetic aperture, which results in improvements in light field capabilities, such as increased depth estimation range/accuracy and a wider perspective shift range.",4 "Joint Bayesian Gaussian discriminant analysis for speaker verification. State-of-the-art i-vector based speaker verification relies on variants of probabilistic linear discriminant analysis (PLDA) for discriminant analysis. We are mainly motivated by the recent work on the joint Bayesian (JB) method, which is originally proposed for discriminant analysis in face verification. We apply JB to speaker verification and make three contributions beyond the original JB. 1) In contrast to the EM iterations with approximated statistics in the original JB, the EM iterations with exact statistics are employed and give better performance. 2) We propose to do simultaneous diagonalization (SD) of the within-class and between-class covariance matrices to achieve efficient testing, which has a broader application scope than the SVD-based efficient testing method in the original JB. 3) We scrutinize similarities and differences between various Gaussian PLDAs and JB, complementing the previous analysis of comparing JB only with the Prince-Elder PLDA. Extensive experiments are conducted on NIST SRE10 core condition 5, empirically validating the superiority of JB with a faster convergence rate and 9-13% EER reduction compared with state-of-the-art PLDA.",4 "An experimental survey on correlation filter-based tracking. Over the years, correlation filter-based trackers (CFTs) have aroused increasing interest in the field of visual object tracking, and have achieved extremely compelling results in different competitions and benchmarks.
In this paper, our goal is to review the developments of CFTs with extensive experimental results. 11 trackers are surveyed in our work, based on which a general framework is summarized. Furthermore, we investigate the different training schemes for correlation filters, and also discuss various effective improvements that have been made recently. Comprehensive experiments have been conducted to evaluate the effectiveness and efficiency of the surveyed CFTs, and comparisons have been made with other competing trackers. The experimental results have shown that state-of-the-art performance, in terms of robustness, speed and accuracy, can be achieved by several recent CFTs, such as MUSTer and SAMF. We find that further improvements for correlation filter-based tracking can be made on estimating scales, applying part-based tracking strategies and cooperating with long-term tracking methods.",4 "Are minds computable?. This essay explores the limits of Turing machines concerning the modeling of minds and suggests alternatives to go beyond those limits.",4 "SOMz: photometric redshift PDFs with self organizing maps and random atlas. In this paper we explore the applicability of the unsupervised machine learning technique of self organizing maps (SOM) to estimate galaxy photometric redshift probability density functions (PDFs). This technique takes a spectroscopic training set, and maps the photometric attributes, but not the redshifts, to a two dimensional surface by using a process of competitive learning where neurons compete to most closely resemble the training data in multidimensional space. The key feature of a SOM is that it retains the topology of the input set, revealing correlations between the attributes that are not easily identified. We test three different 2D topological mappings: rectangular, hexagonal, and spherical, using data from the DEEP2 survey. We also explore different implementations and boundary conditions on the map, and also introduce the idea of a random atlas where a large number of different maps are created and their individual predictions are aggregated to produce a more robust photometric redshift PDF. We also introduce a new metric, the $I$-score, which efficiently incorporates different metrics, making it easier to compare different results (from different parameters or from different photometric redshift codes).
We find that by using a spherical topology mapping we obtain a better representation of the underlying multidimensional topology, which provides more accurate results that are comparable to other, state-of-the-art machine learning algorithms. Our results illustrate that unsupervised approaches have great potential for many astronomical problems, and in particular for the computation of photometric redshifts.",1 "Active reinforcement learning with Monte-Carlo tree search. Active reinforcement learning (ARL) is a twist on RL where the agent observes reward information only if it pays a cost. This subtle change makes exploration substantially more challenging. Powerful principles in RL like optimism, Thompson sampling, and random exploration do not help with ARL. We relate ARL in tabular environments to Bayes-adaptive MDPs. We provide an ARL algorithm using Monte-Carlo tree search that is asymptotically Bayes optimal. Experimentally, this algorithm is near-optimal on small bandit problems and MDPs. On larger MDPs it outperforms a Q-learner augmented with specialised heuristics for ARL. By analysing exploration behaviour in detail, we uncover obstacles to scaling up simulation-based algorithms for ARL.",4 "Semi-supervised generation with cluster-aware generative models. Deep generative models trained with large amounts of unlabelled data have proven to be powerful within the domain of unsupervised learning. Many real life data sets contain a small amount of labelled data points, that are typically disregarded when training generative models. We propose the cluster-aware generative model, which uses unlabelled information to infer a latent representation that models the natural clustering of the data, and additional labelled data points to refine this clustering. The generative performances of the model significantly improve when labelled information is exploited, obtaining a log-likelihood of -79.38 nats on permutation invariant MNIST, while also achieving competitive semi-supervised classification accuracies. The model can also be trained fully unsupervised, and still improve the log-likelihood performance with respect to related methods.",19 "A novel approach to dropped pronoun translation. Dropped pronouns (DPs), in which pronouns are frequently dropped in the source language but should be retained in the target language, are a challenge for machine translation.
In response to this problem, we propose a semi-supervised approach to recall possibly missing pronouns in the translation. Firstly, we build training data for DP generation in which the DPs are automatically labelled according to the alignment information from a parallel corpus. Secondly, we build a deep learning-based DP generator for input sentences in decoding when no corresponding references exist. More specifically, the generation is two-phase: (1) DP position detection, which is modeled as a sequential labelling task with recurrent neural networks; and (2) DP prediction, which employs a multilayer perceptron with rich features. Finally, we integrate the above outputs into our translation system to recall missing pronouns by both extracting rules from the DP-labelled training data and translating the DP-generated input sentences. Experimental results show that our approach achieves a significant improvement of 1.58 BLEU points in translation performance with 66% F-score for DP generation accuracy.",4 "Post-reconstruction deconvolution of PET images by total generalized variation regularization. Improving the quality of positron emission tomography (PET) images, which are affected by low resolution and a high level of noise, is a challenging task in nuclear medicine and radiotherapy. This work proposes a restoration method, achieved after the tomographic reconstruction of the images and targeting clinical situations where raw data are often not accessible. Based on inverse problem methods, our contribution introduces the recently developed total generalized variation (TGV) norm to regularize PET image deconvolution. Moreover, we stabilize this procedure with additional image constraints such as positivity and photometry invariance. A criterion for updating and adjusting automatically the regularization parameter in the case of Poisson noise is also presented. Experiments are conducted on both synthetic data and real patient images.",4 "Multi-task learning for continuous control. Reliable and effective multi-task learning is a prerequisite for the development of robotic agents that can quickly learn to accomplish related, everyday tasks. However, in the reinforcement learning domain, multi-task learning has not exhibited the same level of success as in other domains, such as computer vision.
In addition, reinforcement learning research on multi-task learning has been focused on discrete action spaces, which are not used for robotic control in the real-world. In this work, we apply multi-task learning methods to continuous action spaces and benchmark their performance on a series of simulated continuous control tasks. Most notably, we show that multi-task learning outperforms our baselines and alternative knowledge sharing methods.",4 "NeuralPower: predict and deploy energy-efficient convolutional neural networks. ""How much energy is consumed for an inference made by a convolutional neural network (CNN)?"" With the increased popularity of CNNs deployed on a wide-spectrum of platforms (from mobile devices to workstations), the answer to this question has drawn significant attention. From lengthening the battery life of mobile devices to reducing the energy bill of a datacenter, it is important to understand the energy efficiency of CNNs during serving for making an inference, before actually training the model. In this work, we propose NeuralPower: a layer-wise predictive framework based on sparse polynomial regression, for predicting the serving energy consumption of a CNN deployed on any GPU platform. Given the architecture of a CNN, NeuralPower provides an accurate prediction and breakdown of power and runtime across all layers in the whole network, helping machine learners quickly identify the power, runtime, or energy bottlenecks. We also propose the ""energy-precision ratio"" (EPR) metric to guide machine learners in selecting an energy-efficient CNN architecture that better trades off the energy consumption and prediction accuracy. The experimental results show that the prediction accuracy of the proposed NeuralPower outperforms the best published model to date, yielding an improvement in accuracy of up to 68.5%. We also assess the accuracy of predictions at the network level, by predicting the runtime, power, and energy of state-of-the-art CNN architectures, achieving an average accuracy of 88.24% in runtime, 88.34% in power, and 97.21% in energy. We comprehensively corroborate the effectiveness of NeuralPower as a powerful framework for machine learners by testing it on different GPU platforms and deep learning software tools.",4 "Boundary crossing probabilities for general exponential families. We consider parametric exponential families of dimension $k$ on the real line.
We study a variant of the \textit{boundary crossing probabilities} coming from the multi-armed bandit literature, in the case when the real-valued distributions form an exponential family of dimension $k$. Formally, our result is a concentration inequality that bounds the probability that $\mathcal{B}^\psi(\hat \theta_n,\theta^\star)\geq f(t/n)/n$, where $\theta^\star$ is the parameter of an unknown target distribution, $\hat \theta_n$ is the empirical parameter estimate built from $n$ observations, $\psi$ is the log-partition function of the exponential family and $\mathcal{B}^\psi$ is the corresponding Bregman divergence. From the perspective of stochastic multi-armed bandits, we pay special attention to the case when the boundary function $f$ is logarithmic, as it enables the analysis of the regret of the state-of-the-art \klucb\ and \klucbp\ strategies, whose analysis was left open in such generality. Indeed, previous results only hold for the case when $k=1$, while we provide results for arbitrary finite dimension $k$, thus considerably extending the existing results. Perhaps surprisingly, we highlight that the proof techniques used to achieve these strong results already existed three decades ago in the works of T.L. Lai, and were apparently forgotten by the bandit community. We provide a modern rewriting of these beautiful techniques that we believe are useful beyond the application to stochastic multi-armed bandits.",19 "Motion compensated dynamic MRI reconstruction with local affine optical flow estimation. This paper proposes a novel framework to reconstruct dynamic magnetic resonance images (DMRI) with motion compensation (MC). Due to the inherent motion effects during DMRI acquisition, the reconstruction of DMRI using motion estimation/compensation (ME/MC) has been studied under a compressed sensing (CS) scheme. In this paper, by embedding the intensity-based optical flow (OF) constraint into the traditional CS scheme, we are able to couple the DMRI reconstruction with motion field estimation. The formulated optimization problem is solved by a primal-dual algorithm with linesearch due to its efficiency when dealing with non-differentiable problems. With the estimated motion field, the DMRI reconstruction is refined through MC. By employing a multi-scale coarse-to-fine strategy, we are able to update the variables (temporal image sequences and motion vectors) and refine the image reconstruction alternately.
Moreover, the proposed framework is capable of handling a wide class of prior information (regularizations) for DMRI reconstruction, such as sparsity, low rank and total variation. Experiments on various DMRI data, ranging from in vivo lung to cardiac datasets, validate the reconstruction quality improvement using the proposed scheme in comparison with several state-of-the-art algorithms.",4 "Real-time document image classification using deep CNN and extreme learning machines. This paper presents an approach for real-time training and testing for document image classification. In production environments, it is crucial to perform accurate and (time-)efficient training. Existing deep learning approaches for classifying documents do not meet these requirements, as they require much time for training and fine-tuning the deep architectures. Motivated from computer vision, we propose a two-stage approach. The first stage trains a deep network that works as a feature extractor, and in the second stage, extreme learning machines (ELMs) are used for classification. The proposed approach outperforms previously reported structural and deep learning based methods with a final accuracy of 83.24% on the Tobacco-3482 dataset, leading to a relative error reduction of 25% compared to a previous convolutional neural network (CNN) based approach (DeepDocClassifier). More importantly, the training time of the ELM is only 1.176 seconds, and the overall prediction time for 2,482 images is 3.066 seconds. As such, the novel approach makes deep learning-based document classification suitable for large-scale real-time applications.",4 "Transmitting a signal by amplitude modulation in a chaotic network. We discuss the ability of a network of non linear relays with chaotic dynamics to transmit signals, on the basis of the linear response theory developed by Ruelle \cite{ruelle} for dissipative systems. We show in particular how the dynamics interferes with the graph topology to produce an effective transmission network, whose topology depends on the signal, and cannot be directly read off the ``wired'' network. This leads one to reconsider notions such as ``hubs''. Then, we show examples where, with a suitable choice of the carrier frequency (resonance), one can transmit a signal from one node to another one by amplitude modulation, \textit{in spite of chaos}.
Also, we give an example where a signal, transmitted to a node via different paths, can only be recovered by a couple of \textit{specific} nodes. This opens the possibility of encoding data in such a way that the recovery of the signal requires the knowledge of the carrier frequency \textit{and} can be performed only at a specific node.",13 "Text to speech (TTS) system for English to Punjabi conversion. This paper aims to show how an application was developed that converts the English language into the Punjabi language, and how the application can convert text to speech (TTS), i.e. pronounce the text. This application can be really beneficial for people with special needs.",4 "Lyapunov exponents and adversarial perturbation. In this paper, we would like to disseminate a serendipitous discovery involving Lyapunov exponents of a 1-D time series and their use in serving as a filtering defense tool against a specific kind of deep adversarial perturbation. To this end, we use the state-of-the-art CleverHans library to generate adversarial perturbations against a standard convolutional neural network (CNN) architecture trained on the MNIST as well as the Fashion-MNIST datasets. We empirically demonstrate how the Lyapunov exponents computed on the flattened 1-D vector representations of the images served as highly discriminative features that could be used to pre-classify images as adversarial or legitimate before feeding the image into the CNN for classification. We also explore the issue of possible false-alarms when the input images are noisy in a non-adversarial sense.",4 "Toward a deep neural approach for knowledge-based IR. This paper tackles the problem of the semantic gap between a document and a query within an ad-hoc information retrieval task. In this context, knowledge bases (KBs) have already been acknowledged as valuable means, since they allow the representation of explicit relations between entities. However, they do not necessarily represent implicit relations that could be hidden in corpora. This latter issue is tackled by recent works dealing with deep representation learning of texts. With this in mind, we argue that embedding KBs within deep neural architectures supporting document-query matching would give rise to fine-grained latent representations of both words and their semantic relations. In this paper, we review the main approaches of neural-based document ranking as well as approaches for the latent representation of entities and relations via KBs. We then propose some avenues to incorporate KBs into deep neural approaches for document ranking.
particularly, paper advocates kbs used either support enhanced latent representations queries documents based distributional relational semantics serve semantic translator latent distributional representations.",4 "generating sentiment lexicons german twitter. despite substantial progress made developing new sentiment lexicon generation (slg) methods english, task transferring approaches languages domains sound way still remains open. paper, contribute solution problem systematically comparing semi-automatic translations common english polarity lists results original automatic slg algorithms, applied directly german data. evaluate lexicons corpus 7,992 manually annotated tweets. addition that, also collate results dictionary- corpus-based slg methods order find paradigms better suited inherently noisy domain social media. experiments show semi-automatic translations notably outperform automatic systems (reaching macro-averaged f1-score 0.589), dictionary-based techniques produce much better polarity lists compared corpus-based approaches (whose best f1-scores run 0.479 0.419 respectively) even non-standard twitter genre.",4 "multilingual open relation extraction using cross-lingual projection. open domain relation extraction systems identify relation argument phrases sentence without relying underlying schema. however, current state-of-the-art relation extraction systems available english heavy reliance linguistic tools part-of-speech taggers dependency parsers. present cross-lingual annotation projection method language independent relation extraction. evaluate method manually annotated test set present results three typologically different languages. release manual annotations extracted relations 61 languages wikipedia.",4 "stratified labelings abstract argumentation. introduce stratified labelings novel semantical approach abstract argumentation frameworks. 
compared standard labelings, stratified labelings provide fine-grained assessment controversiality arguments using ranks instead usual labels in, out, undecided. relate framework stratified labelings conditional logic and, particular, system z ranking functions.",4 "multi-device, multi-tenant model selection gp-ei. bayesian optimization core technique behind emergence automl, holds promise automatically searching models hyperparameters make machine learning techniques accessible. services moving towards cloud, ask -- {\em multiple automl users share computational infrastructure, allocate resources maximize ""global happiness"" users?} focus gp-ei, one popular algorithms automatic model selection hyperparameter tuning, develop novel multi-device, multi-tenant extension aware \emph{multiple} computation devices multiple users sharing set computation devices. theoretically, given $n$ users $m$ devices, obtain regret bound $o((\text{\bf {miu}}(t,k) + m)\frac{n^2}{m})$, $\text{\bf {miu}}(t,k)$ refers maximal incremental uncertainty time $t$ covariance matrix $k$. empirically, evaluate algorithm two applications automatic model selection, show algorithm significantly outperforms strategy serving users independently. moreover, multiple computation devices available, achieve near-linear speedup number users much larger number devices.",4 identification complex systems basis wavelets. paper proposed method identification complex dynamic systems. method used identification linear nonlinear complex dynamic systems determined stochastic signals inputs outputs. proposed use basis wavelets obtaining impulse transient function (itf) system. itf considered form surface 3d space. given results experiments identification systems basis wavelets.,4 "learning extract semantic structure documents using multimodal fully convolutional neural network. present end-to-end, multimodal, fully convolutional network extracting semantic structures document images. 
consider document semantic structure extraction pixel-wise segmentation task, propose unified model classifies pixels based visual appearance, traditional page segmentation task, also content underlying text. moreover, propose efficient synthetic document generation process use generate pretraining data network. network trained large set synthetic documents, fine-tune network unlabeled real documents using semi-supervised approach. systematically study optimum network architecture show multimodal approach synthetic data pretraining significantly boost performance.",4 "optimally training cascade classifier. cascade classifiers widely used real-time object detection. different conventional classifiers designed low overall classification error rate, classifier node cascade required achieve extremely high detection rate moderate false positive rate. although reported methods addressing requirement context object detection, principled feature selection method explicitly takes account asymmetric node learning objective. provide algorithm here. show special case biased minimax probability machine formulation linear asymmetric classifier (lac) \cite{wu2005linear}. design new boosting algorithm directly optimizes cost function lac. resulting totally-corrective boosting algorithm implemented column generation technique convex optimization. experimental results object detection verify effectiveness proposed boosting algorithm node classifier cascade object detection, show performance better current state-of-the-art.",4 "modeling human categorization natural images using deep feature representations. last decades, psychologists developed sophisticated formal models human categorization using simple artificial stimuli. paper, use modern machine learning methods extend work realm naturalistic stimuli, enabling human categorization studied complex visual domain evolved developed. 
show representations derived convolutional neural network used model behavior database >300,000 human natural image classifications, find group models based representations perform well, near reliability human judgments. interestingly, group includes exemplar prototype models, contrasting dominance exemplar models previous work. able improve performance remaining models preprocessing neural network representations closely capture human similarity judgments.",4 "general video game ai: learning screen capture. general video game artificial intelligence general game playing framework artificial general intelligence research video-games domain. paper, propose first time screen capture learning agent general video game ai framework. deep q-network algorithm applied improved develop agent capable learning play different games framework. testing algorithm using various games different categories difficulty levels, results suggest proposed screen capture learning agent potential learn many different games using single learning algorithm.",4 "unsupervised domain adaptation meets tensor representations. domain adaption (da) allows machine learning methods trained data sampled one distribution applied data sampled another. thus great practical importance application methods. despite fact tensor representations widely used computer vision capture multi-linear relationships affect data, existing da methods applicable vectors only. renders incapable reflecting preserving important structure many problems. thus propose learning-based method adapt source target tensor representations directly, without vectorization. particular, set alignment matrices introduced align tensor representations domains invariant tensor subspace. alignment matrices tensor subspace modeled joint optimization problem learned adaptively data using proposed alternative minimization scheme. 
extensive experiments show approach capable preserving discriminative power source domain, resisting effects label noise, works effectively small sample sizes, even one-shot da. show method outperforms state-of-the-art task cross-domain visual recognition efficacy efficiency, particularly outperforms comparators applied da convolutional activations deep convolutional networks.",4 "spook: system probabilistic object-oriented knowledge representation. previous work, pointed limitations standard bayesian networks modeling framework large, complex domains. proposed new, richly structured modeling language, {\em object-oriented bayesian networks}, argued would able deal domains. however, turns oobns expressive enough model many interesting aspects complex domains: existence specific named objects, arbitrary relations objects, uncertainty domain structure. aspects crucial real-world domains battlefield awareness. paper, present spook, implemented system addresses limitations. spook implements expressive language allows represent battlespace domain naturally compactly. present new inference algorithm utilizes model structure fundamental way, show empirically achieves orders magnitude speedup existing approaches.",4 "approximate judgement aggregation. paper analyze judgement aggregation problems group agents independently votes set complex propositions interdependency constraint them (e.g., transitivity describing preferences). consider issue judgement aggregation perspective approximation. is, generalize previous results studying approximate judgement aggregation. relax main two constraints assumed current literature, consistency independence consider mechanisms approximately satisfy constraints, is, satisfy small portion inputs. main question raise whether relaxation notions significantly alters class satisfying aggregation mechanisms. recent works preference aggregation kalai, mossel, keller fit framework.
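The interdependency constraints in the aggregation abstract above (e.g. transitivity over preferences) are exactly what pairwise majority can violate. A small self-contained sketch of that classic failure, Condorcet's paradox; the voter profiles are invented for illustration:

```python
def beats(rankings, a, b):
    # True if a strict majority of voters rank a above b (best first).
    wins = sum(1 for r in rankings if r.index(a) < r.index(b))
    return wins > len(rankings) / 2

def majority_is_transitive(rankings, alternatives):
    # The pairwise-majority relation satisfies the transitivity
    # constraint iff it contains no 3-cycle.
    for a in alternatives:
        for b in alternatives:
            for c in alternatives:
                if (beats(rankings, a, b) and beats(rankings, b, c)
                        and beats(rankings, c, a)):
                    return False
    return True

# Condorcet's paradox: three voters whose majorities cycle a > b > c > a.
cyclic = [['a', 'b', 'c'], ['b', 'c', 'a'], ['c', 'a', 'b']]
aligned = [['a', 'b', 'c'], ['a', 'b', 'c'], ['b', 'a', 'c']]
```

On the cyclic profile the majority relation is intransitive even though every individual vote is transitive, which is why aggregation mechanisms satisfying the constraint on all inputs are so restricted.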
main result paper that, case preference aggregation, case subclass natural class aggregation problems termed `truth-functional agendas', set satisfying aggregation mechanisms extend non-trivially relaxing constraints. proof techniques involve boolean fourier transform analysis voter influences voting protocols. question raise approximate aggregation stated terms property testing. instance, corollary result get generalization classic result property testing linearity boolean functions. updated version (repec:huj:dispap:dp574r) available http://www.ratio.huji.ac.il/dp_files/dp574r.pdf",4 "review bilevel optimization: classical evolutionary approaches applications. bilevel optimization defined mathematical program, optimization problem contains another optimization problem constraint. problems received significant attention mathematical programming community. limited work exists bilevel problems using evolutionary computation techniques; however, recently increasing interest due proliferation practical applications potential evolutionary algorithms tackling problems. paper provides comprehensive review bilevel optimization basic principles solution strategies; classical evolutionary. number potential application problems also discussed. offer readers insights prominent developments field bilevel optimization, performed automated text-analysis extended list papers published bilevel optimization date. paper motivate evolutionary computation researchers pay attention practical yet challenging area.",12 "using qualitative relationships bounding probability distributions. exploit qualitative probabilistic relationships among variables computing bounds conditional probability distributions interest bayesian networks. using signs qualitative relationships, implement abstraction operations guaranteed bound distributions interest desired direction. 
evaluating incrementally improved approximate networks, algorithm obtains monotonically tightening bounds converge exact distributions. supermodular utility functions, tightening bounds monotonically reduce set admissible decision alternatives well.",4 "success probability exploration: concrete analysis learning efficiency. exploration crucial part reinforcement learning, yet several important questions concerning exploration efficiency still answered satisfactorily existing analytical frameworks. questions include exploration parameter setting, situation analysis, hardness mdps, unavoidable practitioners. bridge gap theory practice, propose new analytical framework called success probability exploration. show important questions exploration answered framework, answers provided framework meet needs practitioners better existing ones. importantly, introduce concrete practical approach evaluating success probabilities certain mdps without need actually running learning algorithm. provide empirical results verify approach, demonstrate success probability exploration used analyse predict behaviours possible outcomes exploration, keys answer important questions exploration.",4 "deeptype: multilingual entity linking neural type system evolution. wealth structured (e.g. wikidata) unstructured data world available today presents incredible opportunity tomorrow's artificial intelligence. far, integration two different modalities difficult process, involving many decisions concerning best represent information captured useful, hand-labeling large amounts data. deeptype overcomes challenge explicitly integrating symbolic information reasoning process neural network type system. first construct type system, second, use constrain outputs neural network respect symbolic structure. achieve reformulating design problem mixed integer problem: create type system subsequently train neural network it. 
reformulation discrete variables select parent-child relations ontology types within type system, continuous variables control classifier fit type system. original problem cannot solved exactly, propose 2-step algorithm: 1) heuristic search stochastic optimization discrete variables define type system informed oracle learnability heuristic, 2) gradient descent fit classifier parameters. apply deeptype problem entity linking three standard datasets (i.e. wikidisamb30, conll (yago), tac kbp 2010) find outperforms existing solutions wide margin, including approaches rely human-designed type system recent deep learning-based entity embeddings, explicitly using symbolic information lets integrate new entities without retraining.",4 "bayesian forecasting www traffic time varying poisson model. traffic forecasting past observed traffic data small calculation complexity one important problems planning servers networks. focusing world wide web (www) traffic fundamental investigation, paper would deal bayesian forecasting network traffic time varying poisson model viewpoint statistical decision theory. model, would show estimated forecasting value obtained simple arithmetic calculation expresses real www traffic well theoretical empirical points view.",4 "symmetry breaking polynomial delay. conservative class constraint satisfaction problems csps class membership preserved arbitrary domain reductions. many well-known tractable classes csps conservative. well known lexleader constraints may significantly reduce number solutions excluding symmetric solutions csps. show adding certain lexleader constraints instance conservative class csps still allows us find solutions time polynomial successive solutions. time polynomial total size instance additional lexleader constraints. well known complete symmetry breaking one may need exponential number lexleader constraints. however, practice, number additional lexleader constraints typically polynomial number size instance. 
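As a toy illustration of the lexleader idea discussed above, one can keep only assignments that are lexicographically no larger than each of their images under the symmetry group, excluding symmetric duplicates; the 3-variable instance and its reversal symmetry here are invented, not the paper's conservative-class construction:

```python
from itertools import product

def lex_leader(assignment, symmetries):
    # Keep an assignment only if it is lexicographically <= every image
    # of itself under the given variable permutations.
    return all(tuple(assignment) <= tuple(assignment[p] for p in perm)
               for perm in symmetries)

# Toy instance: three 0/1 variables whose only non-trivial symmetry
# is reversal, i.e. x0 and x2 are interchangeable.
reversal = [2, 1, 0]
leaders = [s for s in product([0, 1], repeat=3) if lex_leader(s, [reversal])]
```

Here (1, 0, 0) is dropped because its reversal (0, 0, 1) is lexicographically smaller, so only one representative of each symmetric pair survives.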
polynomially many lexleader constraints, may general complete symmetry breaking polynomially many lexleader constraints may provide practically useful symmetry breaking -- sometimes exclude super-exponentially many solutions. prove instance conservative class, time finding successive solutions instance polynomially many additional lexleader constraints polynomial even size instance without lexleader constraints.",4 "minimax lower bounds ridge combinations including neural nets. estimation functions $ d $ variables considered using ridge combinations form $ \textstyle\sum_{k=1}^m c_{1,k} \phi(\textstyle\sum_{j=1}^d c_{0,j,k}x_j-b_k) $ activation function $ \phi $ function bounded value derivative. include single-hidden layer neural networks, polynomials, sinusoidal models. sample size $ n $ possibly noisy values random sites $ x \in b = [-1,1]^d $, minimax mean square error examined functions closure $ \ell_1 $ hull ridge functions activation $ \phi $. shown order $ d/n $ fractional power (when $ d $ smaller order $ n $), order $ (\log d)/n $ fractional power (when $ d $ larger order $ n $). dependence constraints $ v_0 $ $ v_1 $ $ \ell_1 $ norms inner parameter $ c_0 $ outer parameter $ c_1 $, respectively, also examined. also, lower upper bounds fractional power given. heart analysis development information-theoretic packing numbers classes functions.",19 simplified description fuzzy topsis. simplified description fuzzy topsis (technique order preference similarity ideal solution) presented. adapted topsis description existing fuzzy theory literature distilled bare minimum concepts required understanding applying topsis. example worked illustrate application topsis multi-criteria group decision making scenario.,4 "hybrid ps-v technique: novel sensor fusion approach fast mobile eye-tracking sensor-shift aware correction. paper introduces evaluates hybrid technique fuses efficiently eye-tracking principles photosensor oculography (psog) video oculography (vog).
main concept novel approach use fast power-economic photosensors core mechanism performing high speed eye-tracking, whereas parallel, use video sensor operating low sampling-rate (snapshot mode) perform dead-reckoning error correction sensor movements occur. order evaluate proposed method, simulate functional components technique present results experimental scenarios involving various combinations horizontal vertical eye sensor movements. evaluation shows developed technique used provide robustness sensor shifts otherwise could induce error larger 5 deg. analysis suggests technique potentially enable high speed eye-tracking low power profiles, making suitable used emerging head-mounted devices, e.g. ar/vr headsets.",4 "black-box complexities combinatorial problems. black-box complexity complexity theoretic measure difficult problem optimized general purpose optimization algorithm. thus one means trying understand problems tractable genetic algorithms randomized search heuristics. previous work black-box complexity artificial test functions. paper, move step forward give detailed analysis two combinatorial problems minimum spanning tree single-source shortest paths. besides giving interesting bounds black-box complexities, work reveals choice model optimization problem non-trivial here. particular comes true search space consist bit strings reasonable definition unbiasedness agreed on.",4 "x-ray astronomical point sources recognition using granular binary-tree svm. study point sources astronomical images special importance, since energetic celestial objects universe exhibit point-like appearance. approach recognize point sources (ps) x-ray astronomical images using newly designed granular binary-tree support vector machine (gbt-svm) classifier proposed. first, potential point sources located peak detection image. image spectral features potential point sources extracted. finally, classifier recognize true point sources build extracted features. 
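The first stage of the point-source pipeline above, locating candidates by peak detection, can be approximated by a strict local-maximum scan; the thresholded 8-neighbourhood rule and the toy frame below are illustrative assumptions, not the paper's detector:

```python
def detect_peaks(image, threshold):
    # Flag pixels that exceed the threshold and are strict local maxima
    # within their 8-neighbourhood -- a crude stand-in for the peak
    # detection step that locates candidate point sources.
    rows, cols = len(image), len(image[0])
    peaks = []
    for r in range(rows):
        for c in range(cols):
            v = image[r][c]
            if v < threshold:
                continue
            neighbours = [image[rr][cc]
                          for rr in range(max(0, r - 1), min(rows, r + 2))
                          for cc in range(max(0, c - 1), min(cols, c + 2))
                          if (rr, cc) != (r, c)]
            if all(v > n for n in neighbours):
                peaks.append((r, c))
    return peaks

# Tiny synthetic frame with two bright point-like sources.
frame = [[0, 0, 0, 0, 0],
         [0, 9, 0, 0, 0],
         [0, 0, 0, 7, 0],
         [0, 0, 0, 0, 0]]
```

Each detected coordinate would then feed the feature-extraction and classification stages described above.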
experiments applications approach real x-ray astronomical images demonstrated. comparisons approach svm-based classifiers also carried evaluating precision recall rates, prove approach better achieves higher accuracy around 89%.",4 "subspace-induced gaussian processes. present new gaussian process (gp) regression model covariance kernel indexed parameterized sufficient dimension reduction subspace reproducing kernel hilbert space. covariance kernel low-rank capturing statistical dependency response covariates, affords significant improvement computational efficiency well potential reduction variance predictions. develop fast expectation-maximization algorithm estimating parameters subspace-induced gaussian process (sigp). extensive results real data show sigp outperform standard full gp even low rank-$m$, $m\leq 3$, inducing subspace.",19 "gene expression programming: new adaptive algorithm solving problems. gene expression programming, genotype/phenotype genetic algorithm (linear ramified), presented first time new technique creation computer programs. gene expression programming uses character linear chromosomes composed genes structurally organized head tail. chromosomes function genome subjected modification means mutation, transposition, root transposition, gene transposition, gene recombination, one- two-point recombination. chromosomes encode expression trees object selection. creation separate entities (genome expression tree) distinct functions allows algorithm perform high efficiency greatly surpasses existing adaptive techniques. suite problems chosen illustrate power versatility gene expression programming includes symbolic regression, sequence induction without constant creation, block stacking, cellular automata rules density-classification problem, two problems boolean concept learning: 11-multiplexer gp rule problem.",4 "convergence rates inexact proximal-gradient methods convex optimization. 
consider problem optimizing sum smooth convex function non-smooth convex function using proximal-gradient methods, error present calculation gradient smooth term proximity operator respect non-smooth term. show basic proximal-gradient method accelerated proximal-gradient method achieve convergence rate error-free case, provided errors decrease appropriate rates. using rates, perform well better carefully chosen fixed error level set structured sparsity problems.",4 "deep pose consensus networks. paper, address problem estimating 3d human pose single image, important difficult solve due many reasons, self-occlusions, wild appearance changes, inherent ambiguities 3d estimation 2d cue. difficulties make problem ill-posed, become requiring increasingly complex estimators enhance performance. hand, existing methods try handle problem based single complex estimator, might good solutions. paper, resolve issue, propose multiple-partial-hypothesis-based framework problem estimating 3d human pose single image, fine-tuned end-to-end fashion. first select several joint groups human joint model using proposed sampling scheme, estimate 3d poses joint group separately based deep neural networks. that, aggregated obtain final 3d poses using proposed robust optimization formula. overall procedure fine-tuned end-to-end fashion, resulting better performance. experiments, proposed framework shows state-of-the-art performances popular benchmark data sets, namely human3.6m humaneva, demonstrate effectiveness proposed framework.",4 "self-organizing symbiotic agent. [n. a. baas, emergence, hierarchies, hyper-structures, c.g. langton ed., artificial life iii, addison wesley, 1994.] general framework study emergence hyper-structure presented. approach mostly concerned description systems. paper try bring forth different aspect model feel useful engineering agent based solutions, namely symbiotic approach.
approach self-organizing method dividing complex ""main-problem"" hyper-structure ""sub-problems"" aim reducing complexity desired. description general problem given along instances related work. paper intended serve introductory challenge general solutions described problem.",4 "statistical analysis privacy anonymity guarantees randomized security protocol implementations. security protocols often use randomization achieve probabilistic non-determinism. non-determinism, turn, used obfuscating dependence observable values secret data. since correctness security protocols important, formal analysis security protocols widely studied literature. randomized security protocols also analyzed using formal techniques process-calculi probabilistic model checking. paper, consider problem validating implementations randomized protocols. unlike previous approaches treat protocol white-box, approach tries verify implementation provided black box. goal infer secrecy guarantees provided security protocol statistical techniques. learn probabilistic dependency observable outputs secret inputs using bayesian network. used approximate leakage secret. order evaluate accuracy statistical approach, compare technique probabilistic model checking technique two examples: crowds protocol dining cryptographer's protocol.",4 "blitzkriging: kronecker-structured stochastic gaussian processes. present blitzkriging, new approach fast inference gaussian processes, applicable regression, optimisation classification. state-of-the-art (stochastic) inference gaussian processes large datasets scales cubically number 'inducing inputs', variables introduced factorise model. blitzkriging shares state-of-the-art scaling data, reduces scaling number inducing points approximately linear. further, contrast methods, blitzkriging: force data conform particular structure (including grid-like); reduces reliance error-prone optimisation inducing point locations; able learn rich (covariance) structure data.
demonstrate benefits approach real data regression, time-series prediction signal-interpolation experiments.",19 "direct maximization quadratic weighted kappa. recent years, quadratic weighted kappa growing popularity machine learning community evaluation metric domains target labels predicted drawn integer ratings, usually obtained human experts. example, metric choice several recent, high profile machine learning contests hosted kaggle : https://www.kaggle.com/c/asap-aes , https://www.kaggle.com/c/asap-sas , https://www.kaggle.com/c/diabetic-retinopathy-detection . yet, little understood nature metric, underlying mathematical properties, fits among common evaluation metrics mean squared error (mse) correlation, optimized analytically, so, how. much due cumbersome way metric commonly defined. paper first derive equivalent much simpler, useful, definition quadratic weighted kappa, employ alternate form address issues.",4 "dna2vec: consistent vector representations variable-length k-mers. one ubiquitous representation long dna sequence dividing shorter k-mer components. unfortunately, straightforward vector encoding k-mer one-hot vector vulnerable curse dimensionality. worse yet, distance pair one-hot vectors equidistant. particularly problematic applying latest machine learning algorithms solve problems biological sequence analysis. paper, propose novel method train distributed representations variable-length k-mers. method based popular word embedding model word2vec, trained shallow two-layer neural network. experiments provide evidence summing dna2vec vectors akin nucleotides concatenation. also demonstrate correlation needleman-wunsch similarity score cosine similarity dna2vec vectors.",16 text-to-speech conversion neural networks: recurrent tdnn approach. paper describes design neural network performs phonetic-to-acoustic mapping speech synthesis system. use time-domain neural network architecture limits discontinuities occur phone boundaries. 
recurrent data input also helps smooth output parameter tracks. independent testing demonstrated voice quality produced system compares favorably speech existing commercial text-to-speech systems.,4 "beyond l2-loss functions learning sparse models. incorporating sparsity priors learning tasks give rise simple, interpretable models complex high dimensional data. sparse models found widespread use structure discovery, recovering data corruptions, variety large scale unsupervised supervised learning problems. assuming availability sufficient data, methods infer dictionaries sparse representations optimizing high-fidelity reconstruction. scenarios, reconstruction quality measured using squared euclidean distance, efficient algorithms developed batch online learning cases. however, new application domains motivate looking beyond conventional loss functions. example, robust loss functions $\ell_1$ huber useful learning outlier-resilient models, quantile loss beneficial discovering structures representative particular quantile. new applications motivate work generalizing sparse learning broad class convex loss functions. particular, consider class piecewise linear quadratic (plq) cost functions includes huber, well $\ell_1$, quantile, vapnik, hinge loss, smoothed variants penalties. propose algorithm learn dictionaries obtain sparse codes data reconstruction fidelity measured using smooth plq cost function. provide convergence guarantees proposed algorithm, demonstrate convergence behavior using empirical experiments. furthermore, present three case studies require use plq cost functions: (i) robust image modeling, (ii) tag refinement image annotation retrieval (iii) computing empirical confidence limits subspace clustering.",19 "unsupervised cross-dataset person re-identification transfer learning spatial-temporal patterns. 
proposed person re-identification algorithms conduct supervised training testing single labeled datasets small size, directly deploying trained models large-scale real-world camera network may lead poor performance due underfitting. challenging incrementally optimize models using abundant unlabeled data collected target domain. address challenge, propose unsupervised incremental learning algorithm, tfusion, aided transfer learning pedestrians' spatio-temporal patterns target domain. specifically, algorithm firstly transfers visual classifier trained small labeled source dataset unlabeled target dataset learn pedestrians' spatial-temporal patterns. secondly, bayesian fusion model proposed combine learned spatio-temporal patterns visual features achieve significantly improved classifier. finally, propose learning-to-rank based mutual promotion procedure incrementally optimize classifiers based unlabeled data target domain. comprehensive experiments based multiple real surveillance datasets conducted, results show algorithm gains significant improvement compared state-of-the-art cross-dataset unsupervised person re-identification algorithms.",4 "corrective training algorithm adaptive learning bag generation. sampling problem training corpus one major sources errors corpus-based applications. paper proposes corrective training algorithm best-fit run-time context domain application bag generation. shows objects adjusted adjust probabilities. resulting techniques greatly simplified experimental results demonstrate promising effects training algorithm generic domain specific domain. general, techniques easily extended various language models corpus-based applications.",2 "learning directed acyclic graphs penalized neighbourhood regression. study family regularized score-based estimators learning structure directed acyclic graph (dag) multivariate normal distribution high-dimensional data $p\gg n$.
main results establish support recovery guarantees deviation bounds family penalized least-squares estimators concave regularization without assuming prior knowledge variable ordering. results apply variety practical situations allow arbitrary nondegenerate covariance structures well many popular regularizers including mcp, scad, $\ell_{0}$ $\ell_{1}$. proof relies interpreting dag recursive linear structural equation model, reduces estimation problem series neighbourhood regressions. provide novel statistical analysis neighbourhood problems, establishing uniform control superexponential family neighbourhoods associated gaussian distribution. apply results study statistical properties score-based dag estimators, learning causal dags, inferring conditional independence relations via graphical models. results yield---for first time---finite-sample guarantees structure learning gaussian dags high-dimensions via score-based estimation.",12 "overview annotation creation: processes & tools. creating linguistic annotations requires reliable annotation scheme. annotation complex endeavour potentially involving many people, stages, tools. chapter outlines process creating end-to-end linguistic annotations, identifying specific tasks researchers often perform. tool support central achieving high quality, reusable annotations low cost, focus identifying capabilities necessary useful annotation tools, well common problems tools present reduce utility. although examples specific tools provided many cases, chapter concentrates abstract capabilities problems new tools appear continuously, old tools disappear disuse disrepair. two core capabilities tools must support chosen annotation scheme ability work language study. additional capabilities organized three categories: widely provided; often useful found tools; yet little available tool support.",4 "tumour ellipsification ultrasound images treatment prediction breast cancer. 
Recent advances in using quantitative ultrasound (QUS) methods have provided a promising framework to non-invasively and inexpensively monitor and predict the effectiveness of therapeutic cancer responses. One of the earliest steps in using QUS methods is contouring a region of interest (ROI) inside the tumour in ultrasound B-mode images. While manual segmentation is a time-consuming and tedious task for human experts, auto-contouring is also an extremely difficult task for computers due to the poor quality of ultrasound B-mode images. However, for the purpose of cancer response prediction, only a rough boundary of the tumour as the ROI is needed. In this research, a semi-automated tumour localization approach is proposed for ROI estimation in ultrasound B-mode images acquired from patients with locally advanced breast cancer (LABC). The proposed approach comprises several modules, including 1) feature extraction using keypoint descriptors, 2) augmenting the feature descriptors with the distance of the keypoints from a user-input pixel at the centre of the tumour, 3) supervised learning using a support vector machine (SVM) to classify keypoints as ""tumour"" or ""non-tumour"", and 4) computation of an ellipse outlining the ROI representing the tumour. Experiments with 33 B-mode images from 10 LABC patients yielded promising results with an accuracy of 76.7% based on the Dice coefficient as the performance measure. The results demonstrated that the proposed method can potentially be used as the first stage in a computer-assisted cancer response prediction system for semi-automated contouring of breast tumours.",4 "Efficient baseline-free sampling in parameter exploring policy gradients: super symmetric PGPE. Policy gradient methods that explore directly in parameter space are among the most effective and robust direct policy search methods and have drawn a lot of attention lately. The basic method in this field, policy gradients with parameter-based exploration, uses two samples that are symmetric around the current hypothesis to circumvent the misleading reward in \emph{asymmetrical} reward distributed problems that is gathered with the usual baseline approach. However, the exploration parameters are still updated by a baseline approach - leaving the exploration prone to asymmetric reward distributions.
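The symmetric two-sample trick just described for basic PGPE can be sketched as follows. This is a toy one-dimensional illustration with a made-up quadratic reward, showing only the baseline-free update of the mean hypothesis; the paper's super-symmetric treatment of the exploration parameters is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

def reward(theta):
    # Hypothetical objective with a single peak at theta = 2.
    return -(theta - 2.0) ** 2

mu, sigma = 0.0, 1.0   # current hypothesis and its exploration width
alpha = 0.05           # learning rate (arbitrary choice)
for _ in range(300):
    eps = sigma * rng.standard_normal()
    r_plus = reward(mu + eps)    # symmetric sample pair around mu
    r_minus = reward(mu - eps)
    # Baseline-free gradient estimate for mu from the symmetric pair:
    # the reward difference replaces the usual (reward - baseline) term.
    mu += alpha * eps * (r_plus - r_minus) / 2.0

print(round(mu, 2))  # → 2.0
```

Because the two rewards are compared against each other rather than against a running baseline, a constant offset added to the reward cancels out of the update.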
In this paper we show how the exploration parameters can be sampled quasi-symmetrically, despite having limited instead of free parameters for exploration. We give a transformation approximation to get quasi-symmetric samples with respect to the exploration without changing the overall sampling distribution. Finally we demonstrate that sampling symmetrically also for the exploration parameters is superior, in terms of needed samples and robustness, to the original sampling approach.",4 "Infinite Tucker decomposition: nonparametric Bayesian models for multiway data analysis. Tensor decomposition is a powerful computational tool for multiway data analysis. Many popular tensor decomposition approaches---such as the Tucker decomposition and CANDECOMP/PARAFAC (CP)---amount to multi-linear factorization. They are insufficient to model (i) complex interactions between data entities, (ii) various data types (e.g. missing data and binary data), and (iii) noisy observations and outliers. To address these issues, we propose tensor-variate latent nonparametric Bayesian models, coupled with efficient inference methods, for multiway data analysis. We name these models InfTucker. Using InfTucker, we conduct Tucker decomposition in an infinite feature space. Unlike classical tensor decomposition models, our new approaches handle both continuous and binary data in a probabilistic framework. Unlike previous Bayesian models on matrices and tensors, our models are based on latent Gaussian or $t$ processes with nonlinear covariance functions. To efficiently learn InfTucker from data, we develop a variational inference technique on tensors. Compared with the classical implementation, the new technique reduces time and space complexities by several orders of magnitude. Experimental results on chemometrics and social network datasets demonstrate that our new models achieved significantly higher prediction accuracy than state-of-the-art tensor decomposition",4 "Stochastic IMT (insulator-metal-transition) neurons: an interplay of thermal and threshold noise at bifurcation. A stochastic neuron, a key hardware kernel for implementing stochastic neural networks, is constructed using an insulator-metal-transition (IMT) device, based on an electrically induced phase transition, in series with a tunable resistance.
We show that the IMT neuron dynamics are similar to those of a piecewise linear FitzHugh-Nagumo (FHN) neuron. The spiking statistics of such neurons are demonstrated experimentally using vanadium dioxide (VO$_{2}$) based IMT neurons, and modeled as an Ornstein-Uhlenbeck (OU) process with a fluctuating boundary. The stochastic spiking is explained by thermal noise and threshold fluctuations acting as precursors of bifurcation, which result in a sigmoid-like transfer function. The moments of the interspike intervals are calculated analytically by extending first-passage-time (FPT) models for the Ornstein-Uhlenbeck (OU) process to include a fluctuating boundary. We find that the coefficient of variation of the interspike intervals depends on the relative proportion of thermal and threshold noise. In current experimental demonstrations, where both kinds of noise are present, the coefficient of variation is an order of magnitude higher compared to the case when only thermal noise is present.",4 "A note on the statistical view of matrix completion. A simple interpretation of the matrix completion problem is introduced based on statistical models. Combined with well-known results in missing data analysis, this interpretation indicates that matrix completion is still a valid and principled estimation procedure even without the missing completely at random (MCAR) assumption, which almost all current theoretical studies of matrix completion assume.",19 "Automatic classification of argument components in argumentative essay text documents. To automatically recognize argument components, essay writers need inspections of the texts they have written. This assists the essay scoring process objectively and precisely, as the essay grader is able to see how well the argument components are constructed. Researchers have tried argument detection and classification along with their implementation domains. The common approach is feature extraction from text. Generally, the features are structural, lexical, syntactic, indicator, and contextual. In this research, we add a new feature to the existing features. It adopts the keywords list of Knott and Dale (1993). The experiment result shows that the argument classification achieves 72.45% accuracy. Moreover, we still get comparable accuracy without the keyword lists. This leads to the conclusion that the keyword lists do not affect the features significantly.
However, the features are still weak at classifying major claims versus claims, so we need features that are useful for differentiating these two kinds of argument components.",4 "An expert system for automatic reading of a text written in standard Arabic. In this work we present an expert system for automatic reading and speech synthesis based on text written in standard Arabic. The work is carried out in two major stages: the creation of a sound database, and the transformation of written text into speech (text-to-speech, TTS). This transformation is done firstly by a phonetic orthographical transcription (POT) of the written standard Arabic text, with the aim of transforming it into the corresponding phonetic sequence, and secondly by the generation of the voice signal that corresponds to the transcribed chain. We lay out the different conception stages of the system, as well as the results obtained, compared with other studied works that realize TTS based on standard Arabic.",4 "Beyond volume: the impact of complex healthcare data on the machine learning pipeline. From medical charts to the national census, healthcare has traditionally operated under a paper-based paradigm. However, the past decade has marked a long and arduous transformation bringing healthcare into the digital age. Ranging from electronic health records, to digitized imaging and laboratory reports, to public health datasets, today healthcare generates an incredible amount of digital information. This wealth of data presents an exciting opportunity for integrated machine learning solutions to address problems across multiple facets of healthcare practice and administration. Unfortunately, the ability to derive accurate and informative insights requires more than the ability to execute machine learning models. Rather, a deeper understanding of the data on which the models are run is imperative for their success. While significant effort has been undertaken to develop models able to process the volume of data obtained during the analysis of millions of digitalized patient records, it is important to remember that volume represents only one aspect of the data. In fact, drawing on data from an increasingly diverse set of sources, healthcare data presents an incredibly complex set of attributes that must be accounted for throughout the machine learning pipeline. This chapter focuses on highlighting such challenges, broken down into three distinct components, each representing a phase of the pipeline.
We begin with the attributes of the data that must be accounted for during preprocessing, move on to considerations in model building, and end with the challenges in the interpretation of model output. For each component, we present a discussion around the data as it relates to the healthcare domain and offer insight into the challenges each may impose on the efficiency of machine learning techniques.",4 "PROBE-GK: predictive robust estimation using generalized kernels. Many algorithms in computer vision and robotics make strong assumptions about uncertainty, and rely on the validity of these assumptions to produce accurate and consistent state estimates. In practice, dynamic environments may degrade sensor performance in predictable ways that cannot be captured with static uncertainty parameters. In this paper, we employ fast nonparametric Bayesian inference techniques to more accurately model sensor uncertainty. By setting a prior on observation uncertainty, we derive a predictive robust estimator, and show that our model can be learned from sample images, without knowledge of the motion used to generate the data. We validate our approach through Monte Carlo simulations, and report significant improvements in localization accuracy relative to a fixed noise model in several settings, including on synthetic data, the KITTI dataset, and our own experimental platform.",4 "Fast keypoint detection in video sequences. A number of computer vision tasks exploit a succinct representation of the visual content in the form of sets of local features. Given an input image, feature extraction algorithms identify a set of keypoints and assign to each of them a description vector, based on the characteristics of the visual content surrounding the interest point. Several tasks might require local features to be extracted from a video sequence, on a frame-by-frame basis. Although temporal downsampling has proven to be an effective solution for mobile augmented reality and visual search, high temporal resolution is a key requirement for time-critical applications such as object tracking, event recognition, pedestrian detection, and surveillance. In recent years, several computationally efficient visual feature detectors and descriptors have been proposed. Nonetheless, such approaches are tailored to still images. In this paper we propose a fast keypoint detection algorithm for video sequences, which exploits the temporal coherence of the sequence of keypoints.
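A toy illustration of exploiting temporal coherence as described above: keypoint detection is re-run only in image blocks that changed between consecutive frames. The block size and threshold are arbitrary choices for the sketch, not the paper's parameters.

```python
import numpy as np

def changed_mask(prev, curr, thresh=20, block=8):
    """Mark block-by-block where two grayscale frames differ enough
    that keypoint detection should be repeated."""
    diff = np.abs(curr.astype(int) - prev.astype(int))
    h, w = diff.shape
    mask = np.zeros((h, w), dtype=bool)
    for y in range(0, h, block):
        for x in range(0, w, block):
            if diff[y:y + block, x:x + block].mean() > thresh:
                mask[y:y + block, x:x + block] = True
    return mask

# Two synthetic 32x32 frames that differ only in the top-left corner.
prev = np.zeros((32, 32), dtype=np.uint8)
curr = prev.copy()
curr[:8, :8] = 255
mask = changed_mask(prev, curr)
print(mask[:8, :8].all(), mask[8:, :].any())  # → True False
```

Only the flagged region would be passed to the (comparatively expensive) detector; elsewhere the previous frame's keypoints can simply be reused.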
According to the proposed method, each frame is preprocessed so as to identify the parts of the input frame for which keypoint detection and description need to be performed. Experiments show that it is possible to achieve a reduction in computational time of up to 40%, without significantly affecting the task accuracy.",4 "VIPLFaceNet: an open source deep face recognition SDK. A robust face representation is imperative for highly accurate face recognition. In this work, we propose an open source face recognition method with deep representation named VIPLFaceNet, a 10-layer deep convolutional neural network with 7 convolutional layers and 3 fully-connected layers. Compared with the well-known AlexNet, VIPLFaceNet takes only 20% of the training time and 60% of the testing time, yet achieves a 40\% drop in error rate on the real-world face recognition benchmark LFW. VIPLFaceNet achieves 98.60% mean accuracy on LFW using one single network. An open-source C++ SDK based on VIPLFaceNet is released under the BSD license. The SDK takes about 150ms to process one face image in a single thread on an i7 desktop CPU. VIPLFaceNet provides a state-of-the-art starting point for both academic and industrial face recognition applications.",4 "A multi-task convolutional neural network for mega-city analysis using very high resolution satellite imagery and geospatial data. Mega-city analysis with very high resolution (VHR) satellite images has been drawing increasing interest in the fields of city planning and social investigation. It is known that accurate land-use, urban density, and population distribution information is key to mega-city monitoring and environmental studies. Therefore, generating land-use, urban density, and population distribution maps at a fine scale using VHR satellite images has become a hot topic. Previous studies have focused solely on individual tasks with elaborate hand-crafted features, and have ignored the relationship between the different tasks. In this study, we aim to propose a universal framework which can: 1) automatically learn the internal feature representation from the raw image data; and 2) simultaneously produce fine-scale land-use, urban density, and population distribution maps. For the first target, a deep convolutional neural network (CNN) is applied to learn the hierarchical feature representation from the raw image data.
For the second target, a novel CNN-based universal framework is proposed to process the VHR satellite images and generate the land-use, urban density, and population distribution maps. To the best of our knowledge, this is the first CNN-based mega-city analysis method to process VHR remote sensing images with a large data volume. A VHR satellite image (1.2 m spatial resolution) of the center of Wuhan covering an area of 2606 km2 was used to evaluate the proposed method. The experimental results confirm that the proposed method can achieve promising accuracy for the land-use, urban density, and population distribution maps.",4 "BlackOut: speeding up recurrent neural network language models with very large vocabularies. We propose BlackOut, an approximation algorithm to efficiently train massive recurrent neural network language models (RNNLMs) with million-word vocabularies. BlackOut is motivated by using a discriminative loss, and we describe a new sampling strategy which significantly reduces computation while improving stability, sample efficiency, and the rate of convergence. One way to understand BlackOut is to view it as an extension of the dropout strategy to the output layer, wherein we use a discriminative training loss and a weighted sampling scheme. We also establish close connections between BlackOut, importance sampling, and noise contrastive estimation (NCE). Our experiments, on the recently released one billion word language modeling benchmark, demonstrate the scalability and accuracy of BlackOut; we outperform the state-of-the-art, and achieve the lowest perplexity scores on this dataset. Moreover, unlike other established methods which typically require GPUs or CPU clusters, we show that a carefully implemented version of BlackOut requires only 1-10 days on a single machine to train an RNNLM with a million-word vocabulary and billions of parameters on one billion words. Although we describe BlackOut in the context of RNNLM training, it can be used with any networks with large softmax output layers.",4 "ASR context-sensitive error correction based on the Microsoft N-Gram dataset. At the present time, computers are employed to solve complex tasks and problems ranging from simple calculations to intensive digital image processing and intricate algorithmic optimization problems to computationally-demanding weather forecasting problems.
ASR, short for automatic speech recognition, is yet another type of computational problem whose purpose is to recognize human spoken speech and convert it into text that can be processed by a computer. Despite ASR having many versatile and pervasive real-world applications, it is still relatively erroneous and not perfectly solved, as it is prone to producing spelling errors in the recognized text, especially if the ASR system is operating in a noisy environment, its vocabulary size is limited, or the input speech is of bad or low quality. This paper proposes a post-editing ASR error correction method based on the Microsoft N-Gram dataset for detecting and correcting spelling errors generated by ASR systems. The proposed method comprises an error detection algorithm for detecting word errors; a candidate corrections generation algorithm for generating correction suggestions for the detected word errors; and a context-sensitive error correction algorithm for selecting the best candidate correction. By virtue of using the Microsoft N-Gram dataset, which contains real-world data and word sequences extracted from the web, it can mimic a comprehensive dictionary of words with a large and all-inclusive vocabulary. Experiments conducted on numerous speeches, performed by different speakers, showed a remarkable reduction in ASR errors. Future research can improve upon the proposed algorithm so that it can be parallelized to take advantage of multiprocessor and distributed systems.",4 "Linguistic markers of influence in informal interactions. There has been a long standing interest in understanding `social influence' both in the social sciences and in computational linguistics. In this paper, we present a novel approach to the study and measurement of interpersonal influence in daily interactions. Motivated by the basic principles of influence, we attempt to identify indicative linguistic features of posts in an online knitting community. We present the scheme used to operationalize and label the posts with indicator features. Experiments with the identified features show an improvement in the classification accuracy of influence by 3.15%. Our results illustrate an important correlation between the characteristics of the language and one's potential to influence others.",4 "Using a dynamic neural field model to explore a direct collicular inhibition account of inhibition of return.
When the interval between a transient flash of light (a ""cue"") and a second visual response signal (a ""target"") exceeds at least 200ms, responding is slowest in the direction indicated by the first signal. This phenomenon is commonly referred to as inhibition of return (IOR). The dynamic neural field (DNF) model has proven to have broad explanatory power for IOR, effectively capturing many empirical results. Previous work has used short-term depression (STD) as an implementation of IOR, but this approach fails to explain many behavioral phenomena observed in the literature. Here, we explore a variant of the model of IOR involving a combination of STD and delayed direct collicular inhibition. We demonstrate that this hybrid model can better reproduce established behavioural results. We use the results of the model to propose several experiments that would yield particularly valuable insight into the nature of the neurophysiological mechanisms underlying IOR.",16 "Noisy matrix decomposition via convex relaxation: optimal rates in high dimensions. We analyze a class of estimators based on convex relaxation for solving high-dimensional matrix decomposition problems. The observations are noisy realizations of a linear transformation $\mathfrak{X}$ of the sum of an (approximately) low rank matrix $\theta^\star$ with a second matrix $\gamma^\star$ endowed with a complementary form of low-dimensional structure; this set-up includes many statistical models of interest, including factor analysis, multi-task regression, and robust covariance estimation. We derive a general theorem that bounds the Frobenius norm error for an estimate of the pair $(\theta^\star, \gamma^\star)$ obtained by solving a convex optimization problem that combines the nuclear norm with a general decomposable regularizer. Our results utilize a ""spikiness"" condition that is related to, but milder than, singular vector incoherence. We specialize our general result to two cases that have been studied in past work: low rank plus an entrywise sparse matrix, and low rank plus a columnwise sparse matrix. For both models, our theory yields non-asymptotic Frobenius error bounds for both deterministic and stochastic noise matrices, and applies to matrices $\theta^\star$ that can be exactly or approximately low rank, and matrices $\gamma^\star$ that can be exactly or approximately sparse.
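To make the nuclear-norm component concrete, here is a generic singular-value soft-thresholding step, i.e. the proximal operator of the nuclear norm. This is an illustrative denoising toy on synthetic data, not the paper's estimator; the threshold value is arbitrary.

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: the prox of the nuclear norm at M.

    Soft-thresholds the singular values by tau, shrinking M toward
    a low-rank matrix."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

rng = np.random.default_rng(1)
# Synthetic rank-2 ground truth plus small dense noise.
low_rank = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 30))
noisy = low_rank + 0.01 * rng.standard_normal((30, 30))
denoised = svt(noisy, tau=1.0)  # tau trades shrinkage bias for low rank
print(np.linalg.matrix_rank(noisy), np.linalg.matrix_rank(denoised, tol=1e-6))  # → 30 2
```

Because the noise singular values fall below the threshold while the signal singular values stay far above it, one thresholding step recovers the rank-2 structure exactly (at the cost of slightly shrinking the retained components).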
Moreover, for the case of stochastic noise matrices and the identity observation operator, we establish matching lower bounds on the minimax error. The sharpness of our predictions is confirmed by numerical simulations.",19 "A non-convex weighted Lp nuclear norm based ADMM framework for image restoration. Since the matrix formed by the nonlocal similar patches in a natural image is of low rank, nuclear norm minimization (NNM) has been widely used in various image processing studies. Nonetheless, the nuclear norm based convex surrogate of the rank function usually over-shrinks the rank components and treats the different components equally, and thus may produce a result far from the optimum. To alleviate the above-mentioned limitations of the nuclear norm, in this paper we propose a new method for image restoration via non-convex weighted Lp nuclear norm minimization (NCW-NNM), which is able to more accurately enforce the image structural sparsity and self-similarity simultaneously. To make the proposed model tractable and robust, the alternating direction method of multipliers (ADMM) is adopted to solve the associated non-convex minimization problem. Experimental results on various types of image restoration problems, including image deblurring, image inpainting, and image compressive sensing (CS) recovery, demonstrate that the proposed method outperforms many current state-of-the-art methods in both objective and perceptual qualities.",4 "A learning theoretic approach to energy harvesting communication system optimization. A point-to-point wireless communication system in which the transmitter is equipped with an energy harvesting device and a rechargeable battery is studied. Both the energy and the data arrivals at the transmitter are modeled as Markov processes. Delay-limited communication is considered, assuming that the underlying channel is block fading with memory, and that the instantaneous channel state information is available at both the transmitter and the receiver. The expected total transmitted data during the transmitter's activation time is maximized under three different sets of assumptions regarding the information available at the transmitter about the underlying stochastic processes. A learning theoretic approach is introduced, which does not assume any a priori information on the Markov processes governing the communication system.
In addition, online and offline optimization problems are studied for this setting. Full statistical knowledge and causal information on the realizations of the underlying stochastic processes are assumed in the online optimization problem, while the offline optimization problem assumes non-causal knowledge of the realizations in advance. Comparing the optimal solutions under these three frameworks, the performance loss due to the lack of the transmitter's information regarding the behaviors of the underlying Markov processes is quantified.",4 "Minimal Dirichlet energy partitions for graphs. Motivated by a geometric problem, we introduce a new non-convex graph partitioning objective where the optimality criterion is given by the sum of the Dirichlet eigenvalues of the partition components. A relaxed formulation is identified and a novel rearrangement algorithm is proposed, which we show is strictly decreasing and converges in a finite number of iterations to a local minimum of the relaxed objective function. Our method is applied to several clustering problems on graphs constructed from synthetic data, MNIST handwritten digits, and manifold discretizations. The model has a semi-supervised extension and provides a natural representative for the clusters as well.",12 "Neural lattice language models. In this work, we propose a new language modeling paradigm that has the ability to perform both prediction and moderation of information flow at multiple granularities: neural lattice language models. These models construct a lattice of possible paths through a sentence and marginalize across this lattice to calculate sequence probabilities or optimize parameters. This approach allows us to seamlessly incorporate linguistic intuitions - including polysemy and the existence of multi-word lexical items - into our language model. Experiments on multiple language modeling tasks show that English neural lattice language models that utilize polysemous embeddings are able to improve perplexity by 9.95% relative to a word-level baseline, and that a Chinese model that handles multi-character tokens is able to improve perplexity by 20.94% relative to a character-level baseline.",4 "Characterness: an indicator of text in the wild. Text in an image provides vital information for interpreting its contents, and text in a scene can aid a variety of tasks such as navigation, obstacle avoidance, and odometry.
Despite its value, however, identifying general text in images remains a challenging research problem. Motivated by the need to consider the widely varying forms of natural text, we propose a bottom-up approach to the problem, which reflects the `characterness' of an image region. In this sense our approach mirrors the move in saliency detection methods towards measures of `objectness'. In order to measure the characterness, we develop three novel cues that are tailored for character detection, and a Bayesian method for their integration. Because text is made up of sets of characters, we then design a Markov random field (MRF) model to exploit the inherent dependencies between characters. We experimentally demonstrate the effectiveness of our characterness cues as well as the advantage of Bayesian multi-cue integration. The proposed text detector outperforms state-of-the-art methods on benchmark scene text detection datasets. We also show that our measurement of `characterness' is superior to state-of-the-art saliency detection models when applied to the same task.",4 "Porting HTM models to the Heidelberg neuromorphic computing platform. Hierarchical temporal memory (HTM) is a computational theory of machine intelligence based on a detailed study of the neocortex. The Heidelberg neuromorphic computing platform, developed as part of the Human Brain Project (HBP), is a mixed-signal (analog and digital) large-scale platform for modeling networks of spiking neurons. In this paper we present the first effort in porting HTM networks to this platform. We describe a framework for simulating key HTM operations using spiking network models. We then describe specific spatial pooling and temporal memory implementations, as well as simulations demonstrating that the fundamental properties are maintained. We discuss issues in implementing the full set of plasticity rules using spike-timing dependent plasticity (STDP), and rough place and route calculations. Although further work is required, our initial studies indicate that it should be possible to run large-scale HTM networks (including plasticity rules) efficiently on the Heidelberg platform. More generally, the exercise of porting high level HTM algorithms to biophysical neuron models promises to be a fruitful area of investigation for future studies.",16 "Tangent bundle manifold learning via Grassmann&Stiefel eigenmaps.
One of the ultimate goals of manifold learning (ML) is to reconstruct an unknown nonlinear low-dimensional manifold embedded in a high-dimensional observation space, given a set of data points from the manifold. We derive a local lower bound for the maximum reconstruction error in a small neighborhood of an arbitrary point. The lower bound is defined in terms of the distance between the tangent spaces to the original manifold and to the estimated manifold at the considered point and the reconstructed point, respectively. We propose an amplification of ML, called tangent bundle ML, in which proximity to the original manifold is required not only of the estimator but also of its tangent spaces. We present a new algorithm that solves this problem and gives a new solution for ML as well.",4 "Spherical paragraph model. Representing texts as fixed-length vectors is central to many language processing tasks. Most traditional methods build text representations based on the simple bag-of-words (BoW) representation, which loses the rich semantic relations between words. Recent advances in natural language processing have shown that semantically meaningful representations of words can be efficiently acquired by distributed models, making it possible to build text representations based on a better foundation called the bag-of-word-embedding (BoWE) representation. However, existing text representation methods using BoWE often lack sound probabilistic foundations or cannot well capture the semantic relatedness encoded in word vectors. To address these problems, we introduce the spherical paragraph model (SPM), a probabilistic generative model based on BoWE, for text representation. SPM has good probabilistic interpretability and can fully leverage the rich semantics of words, the word co-occurrence information, as well as the corpus-wide information, to help the representation learning of texts. Experimental results on topical classification and sentiment analysis demonstrate that SPM can achieve new state-of-the-art performances on several benchmark datasets.",4 "Network embedding as matrix factorization: unifying DeepWalk, LINE, PTE, and node2vec. Since the invention of word2vec, the skip-gram model has significantly advanced the research of network embedding, as evidenced by the recent emergence of the DeepWalk, LINE, PTE, and node2vec approaches.
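To make the skip-gram-over-networks idea concrete, here is a toy sketch of the random-walk corpus generation that DeepWalk-style methods feed to a skip-gram model. The graph, walk count, and walk length are made-up illustrative choices; the actual embedding step (training skip-gram on the walks) is omitted.

```python
import random

# Toy undirected graph as an adjacency list (hypothetical example).
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}

def random_walks(graph, num_walks=4, walk_len=5, seed=0):
    """Generate truncated random walks; each walk plays the role of a
    'sentence' of node tokens for skip-gram training."""
    rng = random.Random(seed)
    walks = []
    for _ in range(num_walks):
        for start in graph:
            walk = [start]
            while len(walk) < walk_len:
                walk.append(rng.choice(graph[walk[-1]]))  # uniform neighbor step
            walks.append(walk)
    return walks

walks = random_walks(graph)
print(len(walks), len(walks[0]))  # → 16 5
```

Each walk moves only along edges, so node co-occurrence within a context window reflects graph proximity; it is exactly this implicit co-occurrence matrix that the unification result characterizes in closed form.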
In this work, we show that all of the aforementioned models with negative sampling can be unified into a matrix factorization framework with closed forms. Our analysis and proofs reveal that: (1) DeepWalk empirically produces a low-rank transformation of a network's normalized Laplacian matrix; (2) LINE, in theory, is a special case of DeepWalk when the size of the vertices' context is set to one; (3) as an extension of LINE, PTE can be viewed as the joint factorization of multiple networks' Laplacians; (4) node2vec is factorizing a matrix related to the stationary distribution and transition probability tensor of a 2nd-order random walk. We further provide the theoretical connections between skip-gram based network embedding algorithms and the theory of graph Laplacians. Finally, we present the NetMF method, as well as its approximation algorithm, for computing network embeddings. Our method offers significant improvements over DeepWalk and LINE for conventional network mining tasks. This work lays the theoretical foundation for skip-gram based network embedding methods, leading to a better understanding of latent network representation learning.",4 "Deep successor reinforcement learning. Learning robust value functions given raw observations and rewards is now possible with model-free and model-based deep reinforcement learning algorithms. There is a third alternative, called successor representations (SR), which decomposes the value function into two components -- a reward predictor and a successor map. The successor map represents the expected future state occupancy from any given state, and the reward predictor maps states to scalar rewards. The value function of a state can be computed as the inner product between the successor map and the reward weights. In this paper, we present DSR, which generalizes SR within an end-to-end deep reinforcement learning framework. DSR has several appealing properties, including: increased sensitivity to distal reward changes due to the factorization of reward and world dynamics, and the ability to extract bottleneck states (subgoals) given successor maps trained under a random policy. We show the efficacy of our approach on two diverse environments given raw pixel observations -- simple grid-world domains (MazeBase) and the Doom game engine.",19 "Intrinsic dimension estimation of data by principal component analysis.
Estimating the intrinsic dimensionality of data is a classic problem in pattern recognition and statistics. Principal component analysis (PCA) is a powerful tool for discovering the dimensionality of data sets with a linear structure; it, however, becomes ineffective when the data have a nonlinear structure. In this paper, we propose a new PCA-based method to estimate the intrinsic dimension of data with nonlinear structures. Our method works by first finding a minimal cover of the data set, then performing PCA locally on each subset in the cover, and finally giving the estimation result by checking the data variance on all the small neighborhood regions. The proposed method utilizes the whole data set to estimate its intrinsic dimension and is convenient for incremental learning. In addition, our new PCA procedure can filter out noise in the data and converge to a stable estimation as the neighborhood region size increases. Experiments on synthetic and real world data sets show the effectiveness of the proposed method.",4 "Towards interpretable deep neural networks by leveraging adversarial examples. Deep neural networks (DNNs) have demonstrated impressive performance on a wide array of tasks, but they are usually considered opaque since their internal structure and learned parameters are not interpretable. In this paper, we re-examine the internal representations of DNNs using adversarial images, which are generated by an ensemble-optimization algorithm. We find that: (1) the neurons in DNNs do not truly detect semantic objects/parts, but respond to objects/parts only as recurrent discriminative patches; (2) deep visual representations are not robust distributed codes of visual concepts, because the representations of adversarial images are largely not consistent with those of real images, although they have similar visual appearance, which is different from previous findings. To improve the interpretability of DNNs, we propose an adversarial training scheme with a consistent loss such that the neurons are endowed with human-interpretable concepts. The induced interpretable representations enable us to trace eventual outcomes back to influential neurons. Therefore, human users can know how the models make predictions, as well as how they make errors.",4 "Updating sets of probabilities. There are several well-known justifications for conditioning as the appropriate method for updating a single probability measure, given an observation.
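The local-PCA dimension-estimation idea described above can be sketched as follows. The neighborhood radius and explained-variance threshold are arbitrary illustrative choices, and the sketch handles a single neighborhood rather than a full cover of the data set.

```python
import numpy as np

def local_pca_dim(X, center_idx, radius=0.3, var_ratio=0.95):
    """Estimate intrinsic dimension near one point: run PCA on the points
    within `radius` and count components explaining `var_ratio` of the
    local variance."""
    center = X[center_idx]
    nbrs = X[np.linalg.norm(X - center, axis=1) < radius]
    nbrs = nbrs - nbrs.mean(axis=0)
    # Eigenvalues of the local covariance matrix, largest first.
    evals = np.linalg.eigvalsh(nbrs.T @ nbrs / len(nbrs))[::-1]
    cum = np.cumsum(evals) / evals.sum()
    return int(np.searchsorted(cum, var_ratio) + 1)

# A noisy circle: a 2D point cloud that is intrinsically 1-dimensional.
rng = np.random.default_rng(0)
t = rng.uniform(0, 2 * np.pi, 2000)
X = np.c_[np.cos(t), np.sin(t)] + 0.005 * rng.standard_normal((2000, 2))
print(local_pca_dim(X, center_idx=0))  # → 1
```

Locally the curve looks like a line, so one principal component captures almost all of the neighborhood variance even though the ambient dimension is 2; a full method would aggregate such local estimates over a cover of the data set.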
However, there is a significant body of work arguing that sets of probability measures, rather than single measures, are a more realistic model of uncertainty. Conditioning still makes sense in this context--we simply condition each measure in the set individually, and then combine the results--and, indeed, it seems to be the preferred updating procedure in the literature. But is conditioning justified in this richer setting? We show that, when considering the axiomatic account of conditioning given by van Fraassen, the single-measure and sets-of-measures cases are quite different. We show that van Fraassen's axiomatization for the former case is nowhere near sufficient for updating sets of measures. We give a considerably longer (and more compelling) list of axioms that together force conditioning in this setting, and describe the update methods that are allowed once various of these axioms are dropped.",4 "Algorithmic identification of probabilities. We consider the problem of identifying the probability associated with a set of natural numbers, given an infinite data sequence of elements from the set. If the given sequence is drawn i.i.d. and the probability mass function involved (the target) belongs to a computably enumerable (c.e.) or co-computably enumerable (co-c.e.) set of computable probability mass functions, then there is an algorithm that almost surely identifies the target in the limit. The technical tool is the strong law of large numbers. If the set is finite and the elements of the sequence are dependent while the sequence is typical in the sense of Martin-L\""of for at least one measure belonging to a c.e. or co-c.e. set of computable measures, then there is an algorithm that identifies in the limit a computable measure for which the sequence is typical (there may be more than one such measure). The technical tool is the theory of Kolmogorov complexity. We give the algorithms and consider the associated predictions.",4 "Detecting ontological conflicts in protocols between semantic web services. The task of verifying the compatibility between interacting web services has traditionally been limited to checking the compatibility of the interaction protocol in terms of message sequences and the type of data being exchanged. Since web services are developed largely in an uncoordinated way, different services often use independently developed ontologies for the same domain instead of adhering to a single ontology as a standard. In this work we investigate the approaches that can be taken by the server to verify the possibility of reaching a state with semantically inconsistent results during the execution of a protocol with a client, when the client ontology is published.
Often a database is used to store the actual data along with the ontologies, instead of storing the actual data as part of the ontology description. It is important to observe that, given the current state of the database, a semantic conflict state may not be reachable even when the verification done by the server indicates the possibility of reaching a conflict state. A relational algebra based decision procedure is also developed to incorporate the current state of the client and server databases into the overall verification procedure.",4 "The semantics of Gringo. The input languages of answer set solvers are based on the mathematically simple concept of a stable model. But many useful constructs available in these languages, including local variables, conditional literals, and aggregates, cannot be easily explained in terms of stable models in the sense of the original definition of this concept and its straightforward generalizations. Manuals written by the designers of answer set solvers usually explain such constructs using examples and informal comments that appeal to the user's intuition, without references to any precise semantics. We propose to approach this problem by defining the semantics of Gringo programs through translating its language into infinitary propositional formulas. This semantics allows us to study equivalent transformations of Gringo programs using natural deduction in infinitary propositional logic.",4 "Biometric signature processing & recognition using a radial basis function network. Automatic recognition of a signature is a challenging problem which has received much attention during recent years due to its many applications in different fields. The signature has been used for a long time for verification and authentication purposes. The earlier methods were manual, but nowadays they are getting digitized. This paper provides an efficient method for signature recognition using a radial basis function network. The network is trained with sample images from a database. Feature extraction is performed during training. For testing purposes, an image is made to undergo rotation-translation-scaling correction and is then given to the network. The network successfully identifies the original image and gives the correct output for the stored database images as well. The method provides a recognition rate of approximately 80% for 200 samples.",4 "Toward marker-free 3D pose estimation in lifting: a deep multi-view solution.
lifting common manual material handling task performed workplaces. considered one main risk factors work-related musculoskeletal disorders. improve work place safety, necessary assess musculoskeletal biomechanical risk exposures associated tasks, requires accurate 3d pose. existing approaches mainly utilize marker-based sensors collect 3d information. however, methods usually expensive setup, time-consuming process, sensitive surrounding environment. study, propose multi-view based deep perceptron approach address aforementioned limitations. approach consists two modules: ""view-specific perceptron"" network extracts rich information independently image view, includes 2d shape hierarchical texture information; ""multi-view integration"" network synthesizes information available views predict accurate 3d pose. fully evaluate approach, carried comprehensive experiments compare different variants design. results prove approach achieves comparable performance former marker-based methods, i.e. average error $14.72 \pm 2.96$ mm lifting dataset. results also compared state-of-the-art methods humaneva-i dataset, demonstrates superior performance approach.",4 "learning kernel-based halfspaces zero-one loss. describe analyze new algorithm agnostically learning kernel-based halfspaces respect \emph{zero-one} loss function. unlike previous formulations rely surrogate convex loss functions (e.g. hinge-loss svm log-loss logistic regression), provide finite time/sample guarantees respect natural zero-one loss function. proposed algorithm learn kernel-based halfspaces worst-case time $\mathrm{poly}(\exp(l\log(l/\epsilon)))$, \emph{any} distribution, $l$ lipschitz constant (which thought reciprocal margin), learned classifier worse optimal halfspace $\epsilon$. also prove hardness result, showing certain cryptographic assumption, algorithm learn kernel-based halfspaces time polynomial $l$.",4 "dealing uncertainty situation assessment: towards symbolic approach. 
situation assessment problem considered, terms object, condition, activity, plan recognition, based data coming real-world {\em via} various sensors. shown uncertainty issues linked models matching algorithm. three different types uncertainties identified, within one, numerical symbolic cases distinguished. emphasis put purely symbolic uncertainties: shown dealt within purely symbolic framework resulting transposition classical numerical estimation tools.",4 "proof theoretic view constraint programming. provide proof theoretic account constraint programming attempts capture essential ingredients programming style. exemplify presenting proof rules linear constraints interval domains, illustrate use analyzing constraint propagation process {\tt send + more = money} puzzle. also show approach allows one build new constraint solvers.",4 "evolutionary turing context evolutionary machines. one roots evolutionary computation idea turing unorganized machines. goal work development foundations evolutionary computations, connecting turing's ideas contemporary state art evolutionary computations. achieve goal, develop general approach evolutionary processes computational context, building mathematical models computational systems, functioning based evolutionary processes, studying properties systems. operations evolutionary machines described explored definite classes evolutionary machines closed respect basic operations machines. also study properties linguistic functional equivalence evolutionary machines classes, well computational power evolutionary machines classes, comparing evolutionary machines conventional automata, finite automata turing machines.",4 "fpga-based massively parallel neuromorphic cortex simulator. paper presents massively parallel scalable neuromorphic cortex simulator designed simulating large structurally connected spiking neural networks, complex models various areas cortex. 
main novelty work abstraction neuromorphic architecture clusters represented minicolumns hypercolumns, analogously fundamental structural units observed neurobiology. without approach, simulating large-scale fully connected networks needs prohibitively large memory store look-up tables point-to-point connections. instead, use novel architecture, based structural connectivity neocortex, required parameters connections stored on-chip memory. cortex simulator easily reconfigured simulating different neural networks without change hardware structure programming memory. hierarchical communication scheme allows one neuron fan-out 200k neurons. proof-of-concept, implementation one altera stratix v fpga able simulate 20 million 2.6 billion leaky-integrate-and-fire (lif) neurons real time. verified system emulating simplified auditory cortex (with 100 million neurons). cortex simulator achieved low power dissipation 1.62 $\mu$w per neuron. advent commercially available fpga boards, system offers accessible scalable tool design, real-time simulation, analysis large-scale spiking neural networks.",4 "realization ontology web search engine. paper describes realization ontology web search engine. ontology web search engine realizable independent project part projects. main purpose paper present ontology web search engine realization details part semantic web expert system present results ontology web search engine functioning. expected semantic web expert system able process ontologies web, generate rules ontologies develop knowledge base.",4 "variational probabilistic inference qmr-dt network. describe variational approximation method efficient inference large-scale probabilistic models. variational methods deterministic procedures provide approximations marginal conditional probabilities interest. provide alternatives approximate inference methods based stochastic sampling search. 
describe variational approach problem diagnostic inference `quick medical reference' (qmr) network. qmr network large-scale probabilistic graphical model built statistical expert knowledge. exact probabilistic inference infeasible model small set cases. evaluate variational inference algorithm large set diagnostic test cases, comparing algorithm state-of-the-art stochastic sampling method.",4 "multi-information source optimization. consider bayesian optimization expensive-to-evaluate black-box objective function, also access cheaper approximations objective. general, approximations arise applications reinforcement learning, engineering, natural sciences, subject inherent, unknown bias. model discrepancy caused inadequate internal model deviates reality vary domain, making utilization approximations non-trivial task. present novel algorithm provides rigorous mathematical treatment uncertainties arising model discrepancies noisy observations. optimization decisions rely value information analysis extends knowledge gradient factor setting multiple information sources vary cost: sampling decision maximizes predicted benefit per unit cost. conduct experimental evaluation demonstrates method consistently outperforms state-of-the-art techniques: finds designs considerably higher objective value additionally inflicts less cost exploration process.",19 "completing low-rank matrices corrupted samples coefficients general basis. subspace recovery corrupted missing data crucial various applications signal processing information theory. complete missing values detect column corruptions, existing robust matrix completion (mc) methods mostly concentrate recovering low-rank matrix corrupted coefficients w.r.t. standard basis, which, however, apply general basis, e.g., fourier basis. paper, prove range space $m\times n$ matrix rank $r$ exactly recovered coefficients w.r.t. general basis, though $r$ number corrupted samples high $O(\min\{m,n\}/\log^3(m+n))$. 
model covers previous ones special cases, robust mc recover intrinsic matrix higher rank. moreover, suggest universal choice regularization parameter, $\lambda=1/\sqrt{\log n}$. $\ell_{2,1}$ filtering algorithm, theoretical guarantees, reduce computational cost model. application, also find solutions extended robust low-rank representation extended robust mc mutually expressible, theory algorithm applied subspace clustering problem missing values certain conditions. experiments verify theories.",4 "three-stage quantitative neural network model tip-of-the-tongue phenomenon. new three-stage computer artificial neural network model tip-of-the-tongue phenomenon shortly described, stochastic nature demonstrated. way calculate strength appearance probability tip-of-the-tongue states, neural network mechanism feeling-of-knowing phenomenon proposed. model synthesizes memory, psycholinguistic, metamemory approaches, bridges speech errors naming chronometry research traditions. model analysis tip-of-the-tongue case anton chekhov's short story 'a horsey name' performed. new 'throw-up-one's-arms effect' defined.",4 "denser: deep evolutionary network structured representation. deep evolutionary network structured representation (denser) novel approach automatically design artificial neural networks (anns) using evolutionary computation (ec). algorithm searches best network topology (e.g., number layers, type layers), also tunes hyper-parameters, e.g., learning parameters, data augmentation parameters. automatic design achieved using representation two distinct levels, outer level encodes general structure network, i.e., sequence layers, inner level encodes parameters associated layer. allowed layers hyper-parameter value ranges defined means human-readable context-free grammar. denser used evolve anns two widely used image classification benchmarks obtaining average accuracy result 94.27% cifar-10 dataset, 78.75% cifar-100. 
best knowledge, cifar-100 results highest performing models generated methods aim automatic design convolutional neural networks (cnns), amongst best manually designed fine-tuned cnns.",4 "compilig semeval-2017 task 1: cross-language plagiarism detection methods semantic textual similarity. present submitted systems semantic textual similarity (sts) track 4 semeval-2017. given pair spanish-english sentences, system must estimate semantic similarity score 0 5. submission, use syntax-based, dictionary-based, context-based, mt-based methods. also combine methods unsupervised supervised way. best run ranked 1st track 4a correlation 83.02% human annotations.",4 "assessing performance deep learning algorithms newsvendor problem. retailer management, newsvendor problem widely attracted attention one basic inventory models. traditional approach solving problem, relies probability distribution demand. theory, probability distribution known, problem considered fully solved. however, real world scenario, almost impossible even approximate estimate better probability distribution demand. recent years, researchers start adopting machine learning approach learn demand prediction model using feature information. paper, propose supervised learning optimizes demand quantities products based feature information. demonstrate original newsvendor loss function training objective outperforms recently suggested quadratic loss function. new algorithm assessed synthetic data real-world data, demonstrating better performance.",19 "learning graph-level representation drug discovery. predicting macroscopic influences drugs human body, like efficacy toxicity, central problem small-molecule based drug discovery. molecules represented undirected graph, utilize graph convolution networks prediction molecular properties. however, graph convolutional networks graph neural networks focus learning node-level representation rather graph-level representation. 
previous works simply sum feature vectors nodes graph obtain graph feature vector drug prediction. paper, introduce dummy super node connected nodes graph directed edge representation graph modify graph operation help dummy super node learn graph-level feature. thus, handle graph-level classification regression way node-level classification regression. addition, apply focal loss address class imbalance drug datasets. experiments moleculenet show method effectively improve performance molecular properties prediction.",4 "computational ghost imaging using deep learning. computational ghost imaging (cgi) single-pixel imaging technique exploits correlation known random patterns measured intensity light transmitted (or reflected) object. although cgi obtain two- three- dimensional images single bucket detectors, quality reconstructed images reduced noise due reconstruction images random patterns. study, improve quality cgi images using deep learning. deep neural network used automatically learn features noise-contaminated cgi images. training, network able predict low-noise images new noise-contaminated cgi images.",4 "human eye visual hyperacuity: new paradigm sensing?. human eye appears using low number sensors image capturing. furthermore, regarding physical dimensions cones -photoreceptors responsible sharp central vision-, may realize sensors relatively small size area. nonetheless, eye capable obtain high resolution images due visual hyperacuity presents impressive sensitivity dynamic range set conventional digital cameras similar characteristics. article based hypothesis human eye may benefiting diffraction improve image resolution acquisition process. developed method intends explain simulate using matlab software visual hyperacuity: introduction controlled diffraction pattern initial stage, enables use reduced number sensors capturing image makes possible subsequent processing improve final image resolution. 
results compared outcome equivalent system absence diffraction, achieving promising results. main conclusion work diffraction could helpful capturing images signals small number sensors available, far resolution-limiting factor.",4 "multiscale co-design analysis energy, latency, area, accuracy reram analog neural training accelerator. neural networks increasingly attractive algorithm natural language processing pattern recognition. deep networks >50m parameters made possible modern gpu clusters operating <50 pj per op recently, production accelerators capable <5pj per operation board level. however, slowing cmos scaling, new paradigms required achieve next several orders magnitude performance per watt gains. using analog resistive memory (reram) crossbar perform key matrix operations accelerator attractive option. work presents detailed design using state art 14/16 nm pdk analog crossbar circuit block designed process three key kernels required training inference neural networks. detailed circuit device-level analysis energy, latency, area, accuracy given compared relevant designs using standard digital reram sram operations. shown analog accelerator 270x energy 540x latency advantage similar block utilizing digital reram takes 11 fj per multiply accumulate (mac). compared sram based accelerator, energy 430x better latency 34x better. although training accuracy degraded analog accelerator, several options improve presented. possible gains similar digital-only version accelerator block suggest continued optimization analog resistive memories valuable. detailed circuit device analysis training accelerator may serve foundation architecture-level studies.",4 "detection moving object dynamic background using gaussian max-pooling segmentation constrained rpca. due efficiency stability, robust principal component analysis (rpca) emerging promising tool moving object detection. 
unfortunately, existing rpca based methods assume static quasi-static background, thereby may trouble coping background scenes exhibit persistent dynamic behavior. work, shall introduce two techniques fill gap. first, instead using raw pixel-value features brittle presence dynamic background, devise so-called gaussian max-pooling operator estimate ""stable-value"" pixel. stable-values robust various background changes therefore distinguish effectively foreground objects background. then, obtain accurate results, propose segmentation constrained rpca (sc-rpca) model, incorporates temporal spatial continuity images rpca. inference process sc-rpca group sparsity constrained nuclear norm minimization problem, convex easy solve. experimental results seven videos cdnet 2014 database show superior performance proposed method.",4 "monocular depth estimation learning heterogeneous datasets. depth estimation provides essential information perform autonomous driving driver assistance. especially, monocular depth estimation interesting practical point view, since using single camera cheaper many options avoids need continuous calibration strategies required stereo-vision approaches. state-of-the-art methods monocular depth estimation based convolutional neural networks (cnns). promising line work consists introducing additional semantic information traffic scene training cnns depth estimation. practice, means depth data used cnn training complemented images pixel-wise semantic labels, usually difficult annotate (e.g. crowded urban images). moreover, far common practice assume raw training data associated types ground truth, i.e., depth semantic labels. main contribution paper show hard constraint circumvented, i.e., train cnns depth estimation leveraging depth semantic information coming heterogeneous datasets. 
order illustrate benefits approach, combine kitti depth cityscapes semantic segmentation datasets, outperforming state-of-the-art results monocular depth estimation.",4 "two-phase decision support framework automatic screening digital fundus images. paper give brief review present status automated detection systems describe screening diabetic retinopathy. detail enhanced detection procedure consists two steps. first, pre-screening algorithm considered classify input digital fundus images based severity abnormalities. image found seriously abnormal, analysed robust lesion detector algorithms. improvement, introduce novel feature extraction approach based clinical observations. second step proposed method detects regions interest possible lesions images previously passed pre-screening step. regions serve input specific lesion detectors detailed analysis. procedure increase computational performance screening system. experimental results show two steps proposed approach capable efficiently exclude large amount data processing, thus, decrease computational burden automatic screening system.",4 "learning loss knowledge distillation conditional adversarial networks. increasing interest accelerating neural networks real-time applications. study student-teacher strategy, small fast student network trained auxiliary information provided large accurate teacher network. use conditional adversarial networks learn loss function transfer knowledge teacher student. proposed method particularly effective relatively small student networks. moreover, experimental results show effect network size modern networks used student. empirically study trade-off inference time classification accuracy, provide suggestions choosing proper student.",4 "predicting sla violations real time using online machine learning. detecting faults sla violations timely manner critical telecom providers, order avoid loss business, revenue reputation. 
time predicting sla violations user services telecom environments difficult, due time-varying user demands infrastructure load conditions. paper, propose service-agnostic online learning approach, whereby behavior system learned fly, order predict client-side sla violations. approach uses device-level metrics, collected streaming fashion server side. results show approach produce highly accurate predictions (>90% classification accuracy < 10% false alarm rate) scenarios sla violations predicted video-on-demand service changing load patterns. paper also highlight limitations traditional offline learning methods, perform significantly worse many considered scenarios.",4 "self-learning camera: autonomous adaptation object detectors unlabeled video streams. learning object detectors requires massive amounts labeled training samples specific data source interest. impractical dealing many different sources (e.g., camera networks), constantly changing ones mobile cameras (e.g., robotics driving assistant systems). paper, address problem self-learning detectors autonomous manner, i.e. (i) detectors continuously updating efficiently adapt streaming data sources (contrary transductive algorithms), (ii) without labeled data strongly related target data stream (contrary self-paced learning), (iii) without manual intervention set update hyper-parameters. end, propose unsupervised, on-line, self-tuning learning algorithm optimize multi-task learning convex objective. method uses confident laconic oracles (high-precision low-recall off-the-shelf generic detectors), exploits structure problem jointly learn on-line ensemble instance-level trackers, derive adapted category-level object detector. approach validated real-world publicly available video object datasets.",4 "convex functional image denoising based patches constrained overlaps vectorial application low dose differential phase tomography. 
solve image denoising problem dictionary learning technique writing convex functional new form. functional contains beside usual sparsity inducing term fidelity term, new term induces similarity overlapping patches overlap regions. functional depends two free regularization parameters: coefficient multiplying sparsity-inducing $l_{1}$ norm patch basis functions coefficients, coefficient multiplying $l_{2}$ norm differences patches overlapping regions. solution found applying iterative proximal gradient descent method fista acceleration. case tomography reconstruction calculate gradient applying projection solution error backprojection iterative step. study quality solution, function regularization parameters noise, synthetic data solution a-priori known. apply method experimental data case differential phase tomography. case use original approach consists using vectorial patches, patch two components: one per gradient component. resulting algorithm, implemented esrf tomography reconstruction code pyhst, results robust, efficient, well adapted strongly reduce required dose number projections medical tomography.",12 "clustering multi-way data: novel algebraic approach. paper, develop method unsupervised clustering two-way (matrix) data combining two recent innovations different fields: sparse subspace clustering (ssc) algorithm [10], groups points coming union subspaces respective subspaces, t-product [18], introduced provide matrix-like multiplication third order tensors. algorithm analogous ssc ""affinity"" different data points built using sparse self-representation data. unlike ssc, employ t-product self-representation. allows us flexibility modeling; in fact, ssc special case method. using t-product, three-way arrays treated matrices whose elements (scalars) n-tuples tubes. convolutions take place scalar multiplication. framework allows us embed 2-d data vector-space-like structure called free module commutative ring. 
free modules retain many properties complex inner-product spaces, leverage provide theoretical guarantees algorithm. show compared vector-space counterparts, ssmc achieves higher accuracy better able cluster data less preprocessing image clustering problems. particular show performance proposed method weizmann face database, extended yale b face database mnist handwritten digits database.",4 "multi-horizon solar radiation forecasting mediterranean locations using time series models. considering grid manager's point view, needs terms prediction intermittent energy like photovoltaic resource distinguished according considered horizon: following days (d+1, d+2 d+3), next day hourly step (h+24), next hour (h+1) next minutes (m+5 e.g.). work, identified methodologies using time series models prediction horizon global radiation photovoltaic power. present comparison different predictors developed tested propose hierarchy. horizons d+1 h+1, without advanced ad hoc time series pre-processing (stationarity) find easy differentiate autoregressive moving average (arma) multilayer perceptron (mlp). however observed using exogenous variables improves significantly results mlp. shown mlp adapted horizons h+24 m+5. summary, results complementary improve existing prediction techniques innovative tools: stationarity, numerical weather prediction combination, mlp arma hybridization, multivariate analysis, time index, etc.",15 "feature selection conditional random fields map matching gps trajectories. map matching gps trajectory serves purpose recovering original route road network sequence noisy gps observations. fundamental technique many location based services. however, map matching low sampling rate urban road network still challenging task. paper, characteristics conditional random fields regard inducing many contextual features feature selection explored map matching gps trajectories low sampling rate. 
experiments taxi trajectory dataset show method may achieve competitive results along success reducing model complexity computation-limited applications.",19 "muprop: unbiased backpropagation stochastic neural networks. deep neural networks powerful parametric models trained efficiently using backpropagation algorithm. stochastic neural networks combine power large parametric functions graphical models, makes possible learn complex distributions. however, backpropagation directly applicable stochastic networks include discrete sampling operations within computational graph, training networks remains difficult. present muprop, unbiased gradient estimator stochastic networks, designed make task easier. muprop improves likelihood-ratio estimator reducing variance using control variate based first-order taylor expansion mean-field network. crucially, unlike prior attempts using backpropagation training stochastic networks, resulting estimator unbiased well behaved. experiments structured output prediction discrete latent variable modeling demonstrate muprop yields consistently good performance across range difficult tasks.",4 "egyptian dialect stopword list generation social network data. paper proposes methodology generating stopword list online social network (osn) corpora egyptian dialect (ed). aim paper investigate effect removing ed stopwords sentiment analysis (sa) task. stopwords lists generated modern standard arabic (msa) common language used osn. generated stopword list egyptian dialect used osn corpora. compare efficiency text classification using generated list along previously generated lists msa combining egyptian dialect list msa list. text classification performed using na\""ive bayes decision tree classifiers two feature selection approaches, unigram bigram. experiments show removing ed stopwords give better performance using lists msa stopwords only.",4 "contrast visual saliency similarity induced index image quality assessment. 
perceptual image quality assessment (iqa) defines/utilizes computational model assess image quality consistent human opinions. good iqa model consider effectiveness efficiency, previous iqa models hard reach simultaneously. attempt make another effort develop effective efficient image quality assessment metric. considering contrast distinctive visual attribute indicates quality image, visual saliency (vs) attracts attention human visual system, proposed model utilized two features characterize image local quality. obtaining local contrast quality map global visual saliency quality map, add weighted standard deviation previous two quality maps together yield final quality score. experimental results three benchmark database (live, tid2008, csiq) showed proposed model yields best performance terms correlation human judgments visual quality. furthermore, efficient compared competing iqa models.",4 "incident light frequency-based image defogging algorithm. considering problem color distortion caused defogging algorithm based dark channel prior, improved algorithm proposed calculate transmittance channels respectively. first, incident light frequency's effect transmittance various color channels analyzed according beer-lambert's law, proportion among various channel transmittances derived; afterwards, images preprocessed down-sampling refine transmittance, original size restored enhance operational efficiency algorithm; finally, transmittance color channels acquired accordance proportion, corresponding transmittance used image restoration channel. experimental results show compared existing algorithm, improved image defogging algorithm could make image colors natural, solve problem slightly higher color saturation caused existing algorithm, shorten operation time four nine times.",4 "exploiting saliency object segmentation image level labels. remarkable improvements semantic labelling task recent years. 
however, state art methods rely large-scale pixel-level annotations. paper studies problem training pixel-wise semantic labeller network image-level annotations present object classes. recently, shown high quality seeds indicating discriminative object regions obtained image-level labels. without additional information, obtaining full extent object inherently ill-posed problem due co-occurrences. propose using saliency model additional information hereby exploit prior knowledge object extent image statistics. show combine information sources order recover 80% fully supervised performance - new state art weakly supervised training pixel-wise semantic labelling. code available https://goo.gl/kygseb.",4 "model selection high-dimensional regression generalized irrepresentability condition. high-dimensional regression model response variable linearly related $p$ covariates, sample size $n$ smaller $p$. assume small subset covariates `active' (i.e., corresponding coefficients non-zero), consider model-selection problem identifying active covariates. popular approach estimate regression coefficients lasso ($\ell_1$-regularized least squares). known correctly identify active set irrelevant covariates roughly orthogonal relevant ones, quantified called `irrepresentability' condition. paper study `gauss-lasso' selector, simple two-stage method first solves lasso, performs ordinary least squares restricted lasso active set. formulate `generalized irrepresentability condition' (gic), assumption substantially weaker irrepresentability. prove that, gic, gauss-lasso correctly recovers active set.",12 "reconstructing neural parameters synapses arbitrary interconnected neurons simulated spiking activity. understand behavior neural circuit presupposition model dynamical system describing circuit. model determined several parameters, including synaptic weights, also parameters neuron. existing works mainly concentrate either synaptic weights neural parameters. 
paper present algorithm reconstruct parameters including synaptic weights spiking neuron model. model based works eugene m. izhikevich (izhikevich 2007) consists two differential equations covers different types cortical neurons. combines dynamical properties hodgkin-huxley-type dynamics high computational efficiency. presented algorithm uses recordings corresponding membrane potentials model reconstruction consists two main components. first component rank based genetic algorithm (ga) used find neural parameters model. second one least mean squares approach computes synaptic weights interconnected neurons minimizing squared error calculated measured membrane potentials time step. preparation reconstruction neural parameters synaptic weights real measured membrane potentials, promising results based simulated data generated randomly parametrized izhikevich model presented. reconstruction converge global minimum neural parameters, also approximates synaptic weights high precision.",4 "biologically inspired radio signal feature extraction sparse denoising autoencoders. automatic modulation classification (amc) important task modern communication systems; however, challenging problem signal features precise models generating modulation may unknown. present new biologically-inspired amc method without need models manually specified features --- thus removing requirement expert prior knowledge. accomplish task using regularized stacked sparse denoising autoencoders (ssdas). method selects efficient classification features directly raw in-phase/quadrature (i/q) radio signals unsupervised manner. features used construct higher-complexity abstract features used automatic modulation classification. demonstrate process using dataset generated software defined radio, consisting random input bits encoded 100-sample segments various common digital radio modulations. 
our results show correct classification rates of > 99% at 7.5 db signal-to-noise ratio (snr) and > 92% at 0 db snr in a 6-way classification test. our experiments demonstrate a dramatically new and broadly applicable mechanism for performing amc and related tasks without the need for expert-defined or modulation-specific signal information.",19 "modular autoencoders for ensemble feature extraction. we introduce the concept of a modular autoencoder (mae), capable of learning a set of diverse but complementary representations from unlabelled data, that can later be used for supervised tasks. the learning of the representations is controlled by a trade-off parameter, and we show on six benchmark datasets that the optimum lies between two extremes: a set of smaller, independent autoencoders each with low capacity, versus a single monolithic encoding, outperforming an appropriate baseline. in the present paper we explore the special case of a linear mae, and derive an svd-based algorithm which converges several orders of magnitude faster than gradient descent.",4 "trex: a tomography reconstruction proximal framework for robust sparse view x-ray applications. we present trex, a flexible and robust tomographic reconstruction framework using proximal algorithms. we provide an overview and perform an experimental comparison between famous iterative reconstruction methods in terms of reconstruction quality in sparse view situations. we then derive the proximal operators for the four best methods. we show the flexibility of our framework by deriving solvers for two noise models: gaussian and poisson; and by plugging in three powerful regularizers. we compare our framework to state of the art methods, and show its superior quality on both synthetic and real datasets.",12 "learning a dtw global constraint for time series classification. the 1-nearest neighbor with the dynamic time warping (dtw) distance is one of the most effective classifiers in the time series domain. since the global constraint was introduced in the speech community, many global constraint models have been proposed, including the sakoe-chiba (s-c) band, the itakura parallelogram, and the ratanamahatana-keogh (r-k) band. the r-k band is a general global constraint model that can represent global constraints of arbitrary shape and size effectively.
however, we need a good learning algorithm to discover a suitable set of r-k bands, and the current r-k band learning algorithm still suffers from an 'overfitting' phenomenon. in this paper, we propose two new learning algorithms, i.e., a band boundary extraction algorithm and an iterative learning algorithm. the band boundary extraction is calculated from the bound of all possible warping paths in each class, while the iterative learning is adjusted from the original r-k band learning. we also use the silhouette index, a well-known clustering validation technique, as a heuristic function, and the lower bound function, lb_keogh, to enhance the prediction speed. twenty datasets, from the workshop and challenge on time series classification held in conjunction with sigkdd 2007, are used to evaluate our approach.",4 "confidence estimation in structured prediction. structured classification tasks such as sequence labeling and dependency parsing have seen much interest from the natural language processing and machine learning communities. several online learning algorithms were adapted for structured tasks such as the perceptron, passive-aggressive and the recently introduced confidence-weighted learning. these online algorithms are easy to implement, fast to train and yield state-of-the-art performance. however, unlike probabilistic models like the hidden markov model and conditional random fields, these methods generate models that output merely a prediction with no additional information regarding the confidence in the correctness of the output. in this work we fill this gap by proposing alternatives to compute the confidence in the output of non-probabilistic algorithms. we show how to compute confidence estimates in the prediction such that the confidence reflects the probability that the word is labeled correctly. we then show how to use our methods to detect mislabeled words, trade recall for precision and perform active learning. we evaluate our methods on four noun-phrase chunking and named entity recognition sequence labeling tasks, and on dependency parsing for 14 languages.",4 "dependent landmark drift: robust point set registration based on the gaussian mixture model and a statistical shape model. the goal of point set registration is to find point-by-point correspondences between point sets, each of which characterizes the shape of an object.
because only the local preservation of object geometry is assumed, the prevalent algorithms in this area can often elegantly solve the problems without using geometric information specific to the objects. this means that the registration performance can be further improved by using prior knowledge of object geometry. in this paper, we propose a novel point set registration method using the gaussian mixture model with prior shape information encoded as a statistical shape model. our transformation model is defined as a combination of the similar transformation, motion coherence, and the statistical shape model. therefore, the proposed method works effectively even when the target point set includes outliers and missing regions, or when it is rotated. the computational cost can be reduced to linear, and therefore the method is scalable to large point sets. the effectiveness of the method is verified through comparisons with existing algorithms using datasets concerning human body shapes, hands, and faces.",4 "deep reinforcement learning-based image captioning with embedding reward. image captioning is a challenging problem owing to the complexity in understanding the image content and the diverse ways of describing it in natural language. recent advances in deep neural networks have substantially improved the performance of this task. most state-of-the-art approaches follow an encoder-decoder framework, which generates captions using a sequential recurrent prediction model. however, in this paper, we introduce a novel decision-making framework for image captioning. we utilize a ""policy network"" and a ""value network"" to collaboratively generate captions. the policy network serves as a local guidance by providing the confidence of predicting the next word according to the current state. additionally, the value network serves as a global and lookahead guidance by evaluating all possible extensions of the current state. in essence, it adjusts the goal of predicting the correct words towards the goal of generating captions similar to the ground truth captions. we train both networks using an actor-critic reinforcement learning model, with a novel reward defined by visual-semantic embedding. extensive experiments and analyses on the microsoft coco dataset show that the proposed framework outperforms state-of-the-art approaches across different evaluation metrics.",4 "what words do we use to lie?: word choice in deceptive messages.
text messaging is a widely used form of computer-mediated communication (cmc). previous findings have shown that linguistic factors can reliably indicate whether messages are deceptive. for example, users take longer and use more words when crafting deceptive messages than truthful messages. existing research has also examined how factors, such as student status and gender, affect rates of deception and word choice in deceptive messages. however, this research has been limited by small sample sizes and has returned contradicting findings. this paper aims to address these issues by using a dataset of text messages collected from a large and varied set of participants using an android messaging application. the results of this paper show significant differences in word choice and the frequency of deceptive messages between male and female participants, as well as between students and non-students.",4 "stream computing. stream computing is the use of multiple autonomic and parallel modules together with integrative processors at a higher level of abstraction to embody ""intelligent"" processing. the biological basis of this computing is sketched and the matter of learning is examined.",4 "detecting collusive cliques in futures markets based on trading behaviors from real data. in financial markets, abnormal trading behaviors pose a serious challenge to market surveillance and risk management. what is worse, there is an increasing emergence of abnormal trading events in which experienced traders constitute a collusive clique and collaborate to manipulate some instruments, thus misleading other investors by applying similar trading behaviors for maximizing their personal benefits. in this paper, a method is proposed to detect the hidden collusive cliques involved in an instrument of futures markets by first calculating the correlation coefficient between any two eligible unified aggregated time series of signed order volume, and then combining the connected components from multiple sparsified weighted graphs constructed by using the correlation matrices in which each correlation coefficient is over a user-specified threshold. experiments conducted on real order data from the shanghai futures exchange show that the proposed method can effectively detect the suspect collusive cliques.
a tool based on the proposed method has been deployed in the exchange as a pilot application for futures market surveillance and risk management.",17 "a framework for control strategies in uncertain inference networks. control strategies for hierarchical tree-like probabilistic inference networks are formulated and investigated. the strategies utilize staged look-ahead and a temporary focus on subgoals, and are formalized and refined using the depth vector concept, which serves as a tool for defining a 'virtual tree' that is regarded by the control strategy. the concept is illustrated by four types of control strategies for three-level trees, which are characterized according to their depth vector, i.e., according to the way they consider intermediate nodes and the role they let those nodes play. inferenti, a computerized inference system written in prolog, provides tools for exercising a variety of control strategies. the system also provides tools for simulating test data and comparing the relative average performance of different strategies.",4 "self-adaptive exploration in evolutionary search. we address a primary question of computational as well as biological research on evolution: how can the exploration strategy adapt in a way to exploit the information gained about the problem at hand? we first introduce an integrated formalism of evolutionary search which provides a unified view on different specific approaches. on this basis we discuss the implications of indirect modeling (via a ``genotype-phenotype mapping'') on the exploration strategy. the notions of modularity, pleiotropy and functional phenotypic complexes are discussed with their implications. then, rigorously reflecting on the notion of self-adaptability, we introduce a new definition that captures the self-adaptability of exploration: different genotypes that map to the same phenotype may represent (also topologically) different exploration strategies; self-adaptability requires a variation of exploration strategies along such ``neutral spaces''. by this definition, the concept of neutrality becomes a central concern of this paper.
finally, we present examples for these concepts: for a specific grammar-type encoding, we observe a large variability of exploration strategies for a fixed phenotype, and a self-adaptive drift towards short representations with a highly structured exploration strategy that matches the ``problem's structure''.",15 "real-time interactive sequence generation and control with recurrent neural network ensembles. recurrent neural networks (rnn), particularly long short term memory (lstm) rnns, are a popular and very successful method for learning and generating sequences. however, current generative rnn techniques do not allow real-time interactive control of the sequence generation process, and thus are not well suited for live creative expression. we propose a method for real-time continuous control and 'steering' of sequence generation using an ensemble of rnns and dynamically altering the mixture weights of the models. we demonstrate the method using character based lstm networks and a gestural interface allowing users to 'conduct' the generation of text.",4 discriminative metric learning with deep forest. a discriminative deep forest (disdf) as a metric learning algorithm is proposed in this paper. it is based on the deep forest or gcforest proposed by zhou and feng, and can be viewed as a gcforest modification. the case of fully supervised learning is studied when the class labels of individual training examples are known. the main idea underlying the algorithm is to assign weights to decision trees in the random forest in order to reduce distances between objects from the same class and to increase them between objects from different classes. the weights are training parameters. a specific objective function which combines euclidean and manhattan distances and simplifies the optimization problem for training the disdf is proposed. numerical experiments illustrate the proposed distance metric algorithm.,19 "consciousness as pattern recognition. this is a proof of the strong ai hypothesis, i.e. that machines can be conscious. it is a phenomenological proof that pattern-recognition and subjective consciousness are the same activity in different terms. therefore, it proves that essential subjective processes of consciousness are computable, and identifies significant traits and requirements of a conscious system. since husserl, many philosophers have accepted that consciousness consists of memories of logical connections between an ego and external objects.
these connections are called ""intentions."" pattern recognition systems are achievable technical artifacts. the proof links a respected introspective philosophical theory of consciousness with the technical art. the proof therefore endorses the strong ai hypothesis, and may therefore also enable a theoretically-grounded form of artificial intelligence called a ""synthetic intentionality,"" able to synthesize, generalize, select and repeat intentions. if the pattern recognition is reflexive, able to operate on its own set of intentions, and flexible, with several methods of synthesizing intentions, an si may be a particularly strong form of ai. similarities and possible applications to several ai paradigms are discussed. the article addresses these problems: the proof's limitations, reflexive cognition, searle's chinese room, and how an si could ""understand"" ""meanings"" and ""be creative.""",4 "the matrix calculus you need for deep learning. this paper is an attempt to explain all of the matrix calculus you need in order to understand the training of deep neural networks. we assume no math knowledge beyond what you learned in calculus 1, and provide links to help you refresh the necessary math where needed. note that you do not need to understand this material before you start learning to train and use deep learning in practice; rather, this material is for those who are already familiar with the basics of neural networks, and wish to deepen their understanding of the underlying math. don't worry if you get stuck at some point along the way---just go back and reread the previous section, and try writing down and working through some examples. and if you're still stuck, we're happy to answer your questions in the theory category at forums.fast.ai. note: there is a reference section at the end of the paper summarizing all the key matrix calculus rules and terminology discussed here.",4 "fast power system security analysis with guided dropout. we propose a new method to efficiently compute load-flows (the steady-state of the power-grid for given productions, consumptions and grid topology), substituting conventional simulators based on differential equation solvers. we use a deep feed-forward neural network trained with load-flows precomputed by simulation. our architecture permits us to train a network on so-called ""n-1"" problems, in which load flows are evaluated for every possible line disconnection, and then generalize to ""n-2"" problems without retraining (a clear advantage because of the combinatorial nature of the problem).
to this end, we developed a technique bearing similarity with ""dropout"", which we named ""guided dropout"".",19 "a texture descriptor combining fractal dimension and artificial crawlers. texture is an important visual attribute used to describe images. there are many methods available for texture analysis. however, they do not capture the richness of detail of the image surface. in this paper, we propose a new method to describe textures using the artificial crawler model. this model assumes that each agent can interact with the environment and with other agents. since this swarm system alone does not achieve good discrimination, we developed a new method to increase the discriminatory power of artificial crawlers, together with fractal dimension theory. here, we estimated the fractal dimension by the bouligand-minkowski method due to its precision in quantifying the structural properties of images. we validate our method on two texture datasets and the experimental results reveal that our method leads to highly discriminative textural features. the results indicate that our method can be used in different texture applications.",15 "automatic differentiation of algorithms for machine learning. automatic differentiation---the mechanical transformation of numeric computer programs to calculate derivatives efficiently and accurately---dates to the origin of the computer age. reverse mode automatic differentiation both antedates and generalizes the method of backwards propagation of errors used in machine learning. despite this, practitioners in a variety of fields, including machine learning, have been little influenced by automatic differentiation, and make scant use of available tools. here we review the technique of automatic differentiation, describe its two main modes, and explain how it can benefit machine learning practitioners. to reach the widest possible audience our treatment assumes only elementary differential calculus, and does not assume any knowledge of linear algebra.",4 "avoiding the confusion between predictors and inhibitors in value function approximation. in reinforcement learning, the goal is to seek rewards and avoid punishments. a simple scalar captures the value of a state or of taking an action, where expected future rewards increase and punishments decrease this quantity. naturally an agent can learn to predict this quantity in order to take beneficial actions, and many value function approximators exist for this purpose.
in the present work, however, we show that value function approximators can cause confusion between predictors of an outcome of one valence (e.g., a signal of reward) and inhibitors of the opposite valence (e.g., a signal canceling the expectation of punishment). we show this problem for both linear and non-linear value function approximators, especially when the amount of data (or experience) is limited. we propose and evaluate a simple resolution: instead of a single value, predict reward and punishment values separately, then rectify and add them to get the value needed for decision making. we evaluate several function approximators in this slightly different value function approximation architecture and show that this approach is able to circumvent the confusion and thereby achieve lower value-prediction errors.",4 "a variational bayesian approach for image restoration. application to image deblurring with poisson-gaussian noise. in this paper, a methodology is investigated for signal recovery in the presence of non-gaussian noise. in contrast with regularized minimization approaches often adopted in the literature, in our algorithm the regularization parameter is reliably estimated from the observations. as the posterior density of the unknown parameters is analytically intractable, the estimation problem is derived in a variational bayesian framework where the goal is to provide a good approximation to the posterior distribution in order to compute posterior mean estimates. moreover, a majorization technique is employed to circumvent the difficulties raised by the intricate forms of the non-gaussian likelihood and of the prior density. we demonstrate the potential of the proposed approach through comparisons with state-of-the-art techniques that are specifically tailored to signal recovery in the presence of mixed poisson-gaussian noise. results show that the proposed approach is efficient and achieves performance comparable with other methods where the regularization parameter is manually tuned from the ground truth.",12 "resnet in resnet: generalizing residual architectures. residual networks (resnets) have recently achieved state-of-the-art on challenging computer vision tasks. we introduce resnet in resnet (rir): a deep dual-stream architecture that generalizes resnets and standard cnns and is easily implemented with no computational overhead.
rir consistently improves performance over resnets, outperforms architectures with similar amounts of augmentation on cifar-10, and establishes a new state-of-the-art on cifar-100.",4 "learning to cluster in order to transfer across domains and tasks. this paper introduces a novel method to perform transfer learning across domains and tasks, formulating it as a problem of learning to cluster. the key insight is that, in addition to features, we can transfer similarity information, and this is sufficient to learn a similarity function and a clustering network to perform both domain adaptation and cross-task transfer learning. we begin by reducing categorical information to pairwise constraints, which only consider whether two instances belong to the same class or not. this similarity is category-agnostic and can be learned from data in the source domain using a similarity network. we then present two novel approaches for performing transfer learning using this similarity function. first, for unsupervised domain adaptation, we design a new loss function to regularize classification with a constrained clustering loss, hence learning a clustering network with the transferred similarity metric generating the training inputs. second, for cross-task learning (i.e., unsupervised clustering with unseen categories), we propose a framework to reconstruct and estimate the number of semantic clusters, again using the clustering network. since the similarity network is noisy, the key is to use a robust clustering algorithm, and we show that our formulation is more robust than the alternative constrained and unconstrained clustering approaches. using this method, we first show state of the art results for the challenging cross-task problem, applied on omniglot and imagenet. our results show that we can reconstruct semantic clusters with high accuracy. we then evaluate the performance of cross-domain transfer using images from the office-31 and svhn-mnist tasks and present top accuracy on both datasets. our approach does not explicitly deal with domain discrepancy; if we combine it with a domain adaptation loss, it shows further improvement.",4 "riemannian dictionary learning and sparse coding for positive definite matrices. data encoded as symmetric positive definite (spd) matrices frequently arise in many areas of computer vision and machine learning.
while these matrices form an open subset of the euclidean space of symmetric matrices, viewing them through the lens of non-euclidean riemannian geometry often turns out to be better suited to capturing several desirable data properties. however, formulating classical machine learning algorithms within such a geometry is often non-trivial and computationally expensive. inspired by the great success of dictionary learning and sparse coding for vector-valued data, our goal in this paper is to represent data in the form of spd matrices as sparse conic combinations of spd atoms from a learned dictionary via a riemannian geometric approach. to that end, we formulate a novel riemannian optimization objective for dictionary learning and sparse coding in which the representation loss is characterized via the affine invariant riemannian metric. we also present a computationally simple algorithm for optimizing our model. experiments on several computer vision datasets demonstrate superior classification and retrieval performance using our approach compared to sparse coding via alternative non-riemannian formulations.",4 "non-local graph-based prediction for reversible data hiding in images. reversible data hiding (rdh) is desirable in applications where both the hidden message and the cover medium need to be recovered without loss. among the many rdh approaches is prediction-error expansion (pee), containing two steps: i) prediction of a target pixel value, and ii) embedding according to the value of the prediction-error. in general, higher prediction performance leads to larger embedding capacity and/or lower signal distortion. leveraging recent advances in graph signal processing (gsp), we pose pixel prediction as a graph-signal restoration problem, where the appropriate edge weights of the underlying graph are computed using a similar patch searched in a semi-local neighborhood. specifically, for each candidate patch, we first examine the eigenvalues of its structure tensor to estimate its local smoothness. if sufficiently smooth, we pose a maximum a posteriori (map) problem using either a quadratic laplacian regularizer or a graph total variation (gtv) term as the signal prior.
while the map problem using the first prior has a closed-form solution, we design an efficient algorithm for the second prior using the alternating direction method of multipliers (admm) with nested proximal gradient descent. experimental results show that with better quality gsp-based prediction, at low capacity the visual quality of the embedded image exceeds state-of-the-art methods noticeably.",6 "non-backtracking spectrum of degree-corrected stochastic block models. motivated by community detection, we characterise the spectrum of the non-backtracking matrix $b$ in the degree-corrected stochastic block model. specifically, we consider a random graph on $n$ vertices partitioned into two equal-sized clusters. the vertices have i.i.d. weights $\{ \phi_u \}_{u=1}^n$ with second moment $\phi^{(2)}$. the intra-cluster connection probability for vertices $u$ and $v$ is $\frac{\phi_u \phi_v}{n}a$ and the inter-cluster connection probability is $\frac{\phi_u \phi_v}{n}b$. we show that with high probability, the following holds: the leading eigenvalue of the non-backtracking matrix $b$ is asymptotic to $\rho = \frac{a+b}{2} \phi^{(2)}$. the second eigenvalue is asymptotic to $\mu_2 = \frac{a-b}{2} \phi^{(2)}$ when $\mu_2^2 > \rho$, but asymptotically bounded by $\sqrt{\rho}$ when $\mu_2^2 \leq \rho$. all the remaining eigenvalues are asymptotically bounded by $\sqrt{\rho}$. as a result, a clustering positively-correlated with the true communities can be obtained based on the second eigenvector of $b$ in the regime where $\mu_2^2 > \rho.$ in a previous work we obtained that detection is impossible when $\mu_2^2 < \rho,$ meaning that there occurs a phase-transition in the sparse regime of the degree-corrected stochastic block model. as a corollary, we obtain that degree-corrected erd\h{o}s-r\'enyi graphs asymptotically satisfy the graph riemann hypothesis, a quasi-ramanujan property. a by-product of our proof is a weak law of large numbers for local-functionals on degree-corrected stochastic block models, which could be of independent interest.",12 "sgd and hogwild! convergence without the bounded gradients assumption. stochastic gradient descent (sgd) is the optimization algorithm of choice in many machine learning applications such as regularized empirical risk minimization and training deep neural networks.
the classical analysis of convergence of sgd is carried out under the assumption that the norm of the stochastic gradient is uniformly bounded. while this might hold for some loss functions, it is always violated for cases where the objective function is strongly convex. in (bottou et al., 2016) a new analysis of convergence of sgd is performed under the assumption that stochastic gradients are bounded with respect to the true gradient norm. here we show that for stochastic problems arising in machine learning such a bound always holds. moreover, we propose an alternative convergence analysis of sgd with a diminishing learning rate regime, which results in more relaxed conditions than those in (bottou et al., 2016). we then move to the asynchronous parallel setting, and prove convergence of the hogwild! algorithm in the same regime, obtaining the first convergence results for this method in the case of a diminished learning rate.",12 "hyperspectral unmixing with endmember variability using semi-supervised partial membership latent dirichlet allocation. a semi-supervised partial membership latent dirichlet allocation approach is developed for hyperspectral unmixing and endmember estimation while accounting for spectral variability and spatial information. partial membership latent dirichlet allocation is an effective approach for spectral unmixing while representing spectral variability and leveraging spatial information. in this work, we extend partial membership latent dirichlet allocation to incorporate any available (imprecise) label information to help guide the unmixing. experimental results on two hyperspectral datasets show that the proposed semi-supervised pm-lda can yield improved hyperspectral unmixing and endmember estimation results.",4 "room for improvement in automatic image description: an error analysis. recent years have seen rapid and significant progress in automatic image description, but what are the open problems in this area? most work has been evaluated using text-based similarity metrics, which only indicate that there have been improvements, without explaining what has improved. in this paper, we present a detailed error analysis of the descriptions generated by a state-of-the-art attention-based model. our analysis operates on two levels: first we check the descriptions for accuracy, and then we categorize the types of errors we observe in the inaccurate descriptions. we find that 20% of the descriptions are free from errors, and surprisingly that 26% are unrelated to the image.
finally, we manually correct the most frequently occurring error types (e.g. gender identification) to estimate the performance reward for addressing these errors, observing gains of 0.2--1 bleu point per type.",4 "learning multiple tasks with multilinear relationship networks. deep networks trained on large-scale data can learn transferable features to promote learning multiple tasks. since deep features eventually transition from general to specific along deep networks, a fundamental problem of multi-task learning is how to exploit the task relatedness underlying parameter tensors and improve feature transferability in the multiple task-specific layers. this paper presents multilinear relationship networks (mrn) that discover the task relationships based on novel tensor normal priors over the parameter tensors of multiple task-specific layers in deep convolutional networks. by jointly learning transferable features and multilinear relationships of tasks and features, mrn is able to alleviate the dilemma of negative-transfer in the feature layers and under-transfer in the classifier layer. experiments show that mrn yields state-of-the-art results on three multi-task learning datasets.",4 "dfacto: distributed factorization of tensors. we present a technique for significantly speeding up alternating least squares (als) and gradient descent (gd), two widely used algorithms for tensor factorization. by exploiting properties of the khatri-rao product, we show how to efficiently address a computationally challenging sub-step of both algorithms. our algorithm, dfacto, only requires two sparse matrix-vector products and is easy to parallelize. dfacto is not only scalable but also on average 4 to 10 times faster than competing algorithms on a variety of datasets. for instance, dfacto only takes 480 seconds on 4 machines to perform one iteration of the als algorithm and 1,143 seconds to perform one iteration of the gd algorithm on a 6.5 million x 2.5 million x 1.5 million dimensional tensor with 1.2 billion non-zero entries.",19 "vpgnet: vanishing point guided network for lane and road marking detection and recognition. in this paper, we propose a unified end-to-end trainable multi-task network that jointly handles lane and road marking detection and recognition, guided by a vanishing point, under adverse weather conditions.
we tackle rainy and low illumination conditions, which have not been extensively studied until now due to their clear challenges. for example, images taken on rainy days are subject to low illumination, while wet roads cause light reflection and distort the appearance of lane and road markings. at night, color distortion occurs under the limited illumination. as a result, no benchmark dataset exists, and only a few developed algorithms work under poor weather conditions. to address this shortcoming, we build up a lane and road marking benchmark which consists of about 20,000 images with 17 lane and road marking classes under four different scenarios: no rain, rain, heavy rain, and night. we train and evaluate several versions of the proposed multi-task network and validate the importance of each task. the resulting approach, vpgnet, can detect and classify lanes and road markings, and predict a vanishing point with a single forward pass. experimental results show that our approach achieves high accuracy and robustness under various conditions in real-time (20 fps). the benchmark and the vpgnet model will be publicly available.",4 "boltzmann machines and energy-based models. we review boltzmann machines and energy-based models. a boltzmann machine defines a probability distribution over binary-valued patterns. one can learn the parameters of a boltzmann machine via gradient based approaches in a way that the log likelihood of the data is increased. the gradient and laplacian of a boltzmann machine admit beautiful mathematical representations, although computing them is in general intractable. this intractability motivates approximate methods, including the gibbs sampler and contrastive divergence, and tractable alternatives, namely energy-based models.",4 "multi-task convolutional neural network for pose-invariant face recognition. this paper explores multi-task learning (mtl) for face recognition. we answer the questions of how and why mtl can improve the face recognition performance. first, we propose a multi-task convolutional neural network (cnn) for face recognition where identity classification is the main task and pose, illumination, and expression estimations are the side tasks. second, we develop a dynamic-weighting scheme to automatically assign the loss weight to each side task, which is a crucial problem in mtl.
third, we propose a pose-directed multi-task cnn by grouping different poses to learn pose-specific identity features, simultaneously across all poses. last but not least, we propose an energy-based weight analysis method to explore how cnn-based mtl works. we observe that the side tasks serve as regularizations to disentangle the variations from the learnt identity features. extensive experiments on the entire multi-pie dataset demonstrate the effectiveness of the proposed approach. to the best of our knowledge, this is the first work using all data in multi-pie for face recognition. our approach is also applicable to in-the-wild datasets for pose-invariant face recognition and achieves comparable or better performance than the state of the art on the lfw, cfp, and ijb-a datasets.",4 "sparse image representation with discrete cosine/spline based dictionaries. mixed dictionaries generated by cosine and b-spline functions are considered. it is shown that, by highly nonlinear approaches such as orthogonal matching pursuit, the discrete version of the proposed dictionaries yields a significant gain in the sparsity of an image representation.",12 "the complexity of manipulating $k$-approval elections. an important problem in computational social choice theory is the complexity of undesirable behavior among agents, such as control, manipulation, and bribery in election systems. these kinds of voting strategies are often tempting at the individual level but disastrous for the agents as a whole. creating election systems where the determination of such strategies is difficult is thus an important goal. an interesting set of elections is that of scoring protocols. previous work in this area has demonstrated the complexity of misuse in cases involving a fixed number of candidates, and of specific election systems on an unbounded number of candidates such as borda. in contrast, we take the first step in generalizing results on the computational complexity of election misuse to cases of infinitely many scoring protocols on an unbounded number of candidates. interesting families of such systems include $k$-approval and $k$-veto elections, in which voters distinguish $k$ candidates from the candidate set. our main result is to partition the problems of these families based on their complexity. we do so by showing they are polynomial-time computable, np-hard, or polynomial-time equivalent to another problem of interest.
we also demonstrate a surprising connection between manipulation in election systems and graph theory problems.",4 "exploiting convolutional representations for multiscale human settlement detection. we test this premise and explore the representation spaces from a single deep convolutional network and their visualization to argue for a novel unified feature extraction framework. the objective is to utilize and re-purpose trained feature extractors without the need for network retraining on three remote sensing tasks, i.e. superpixel mapping, pixel-level segmentation and semantic based image visualization. by leveraging the same convolutional feature extractors and viewing them as visual information extractors that encode different image representation spaces, we demonstrate a preliminary inductive transfer learning potential on multiscale experiments that incorporate edge-level details up to semantic-level information.",4 "privlogit: efficient privacy-preserving logistic regression by tailoring numerical optimizers. safeguarding privacy in machine learning is highly desirable, especially in collaborative studies across many organizations. privacy-preserving distributed machine learning (based on cryptography) is popular for solving the problem. however, existing cryptographic protocols still incur excess computational overhead. here, we make a novel observation that this is partially due to the naive adoption of mainstream numerical optimization (e.g., the newton method) and failing to tailor it for secure computing. this work presents a contrasting perspective: customizing numerical optimization specifically for secure settings. we propose a seemingly less-favorable optimization method that can in fact significantly accelerate privacy-preserving logistic regression. leveraging this new method, we propose two new secure protocols for conducting logistic regression in a privacy-preserving and distributed manner. extensive theoretical and empirical evaluations prove the competitive performance of our two secure proposals without compromising accuracy or privacy: with speedups of up to 2.3x and 8.1x, respectively, over the state-of-the-art; and even faster as data scales up. such drastic speedup is on top of, and in addition to, performance improvements from existing (and future) state-of-the-art cryptography.
work provides new way towards efficient practical privacy-preserving logistic regression large-scale studies common modern science.",4 "consistent on-line off-policy evaluation. problem on-line off-policy evaluation (ope) actively studied last decade due importance stand-alone problem module policy improvement scheme. however, temporal difference (td) based solutions ignore discrepancy stationary distribution behavior target policies effect convergence limit function approximation applied. paper propose consistent off-policy temporal difference (cop-td($\lambda$, $\beta$)) algorithm addresses issue reduces bias computational expense. show cop-td($\lambda$, $\beta$) designed converge value would obtained using on-policy td($\lambda$) target policy. subsequently, proposed scheme leads related promising heuristic call log-cop-td($\lambda$, $\beta$). algorithms favorable empirical results current state art on-line ope algorithms. finally, formulation sheds new light recently proposed emphatic td learning.",19 "fast 5dof needle tracking ioct. purpose. intraoperative optical coherence tomography (ioct) increasingly available imaging technique ophthalmic microsurgery provides high-resolution cross-sectional information surgical scene. propose build desirable qualities present method tracking orientation location surgical needle. thereby, enable direct analysis instrument-tissue interaction directly oct space without complex multimodal calibration would required traditional instrument tracking methods. method. intersection needle ioct scan detected peculiar multi-step ellipse fitting takes advantage directionality modality. geometric modelling allows us use ellipse parameters provide latency aware estimator infer 5dof pose needle movement. results. experiments phantom data ex-vivo porcine eyes indicate algorithm retains angular precision especially lateral needle movement provides robust consistent estimation baseline methods. conclusion. 
using solely cross-sectional ioct information, able successfully robustly estimate 5dof pose instrument less 5.5 ms cpu.",4 "missing found recognition system hajj umrah. note describes integrated recognition system identifying missing found objects well missing, dead, found people hajj umrah seasons two holy cities makkah madina kingdom saudi arabia. assumed total estimated number pilgrims reach 20 million next decade. ultimate goal system integrate facial recognition object identification solutions hajj umrah rituals. missing found computerized system part crowdsensing system hajj umrah crowd estimation, management safety.",4 "margin-based feed-forward neural network classifiers. margin-based principle proposed long time, proved principle could reduce structural risk improve performance theoretical practical aspects. meanwhile, feed-forward neural network traditional classifier, hot present deeper architecture. however, training algorithm feed-forward neural network developed generated widrow-hoff principle means minimize squared error. paper, propose new training algorithm feed-forward neural networks based margin-based principle, could effectively promote accuracy generalization ability neural network classifiers less labelled samples flexible network. conducted experiments four uci open datasets achieved good results expected. conclusion, model could handle sparse labelled high-dimension dataset high accuracy modification old ann method method easy almost free work.",4 "cells islands: unified model cellular parallel genetic algorithms. paper presents anisotropic selection scheme cellular genetic algorithms (cga). new scheme allows enhance diversity control selective pressure two important issues genetic algorithms, especially trying solve difficult optimization problems. varying anisotropic degree selection allows swapping cellular island model parallel genetic algorithm.
measures performances diversity performed one well-known problem: quadratic assignment problem known difficult optimize. experiments show that, tuning anisotropic degree, find accurate trade-off cga island models optimize performances parallel evolutionary algorithms. trade-off interpreted suitable degree migration among subpopulations parallel genetic algorithm.",4 "hierarchical compositional feature learning. introduce hierarchical compositional network (hcn), directed generative model able discover disentangle, without supervision, building blocks set binary images. building blocks binary features defined hierarchically composition features layer immediately below, arranged particular manner. high level, hcn similar sigmoid belief network pooling. inference learning hcn challenging existing variational approximations work satisfactorily. main contribution work show addressed using max-product message passing (mpmp) particular schedule (no em required). also, using mpmp inference engine hcn makes new tasks simple: adding supervision information, classifying images, performing inpainting correspond clamping variables model known values running mpmp rest. used classification, fast inference hcn exactly functional form convolutional neural network (cnn) linear activations binary weights. however, hcn's features qualitatively different.",4 "near-infrared image dehazing via color regularization. near-infrared imaging capture haze-free near-infrared gray images visible color images, according physical scattering models, e.g., rayleigh mie models. however, exist serious discrepancies brightness image structures near-infrared gray images visible color images. direct use near-infrared gray images brings another color distortion problem dehazed images. therefore, color distortion also considered near-infrared dehazing. reflect point, paper presents approach adding new color regularization conventional dehazing framework.
proposed color regularization model color prior unknown haze-free images two captured images. thus, natural-looking colors fine details induced dehazed images. experimental results show proposed color regularization model help remove color distortion haze time. also, effectiveness proposed color regularization verified comparing conventional regularizations. also shown proposed color regularization remove edge artifacts arise use conventional dark prior model.",4 "opus: efficient admissible algorithm unordered search. opus branch bound search algorithm enables efficient admissible search spaces order search operator application significant. algorithm's search efficiency demonstrated respect large machine learning search spaces. use admissible search potential value machine learning community means exact learning biases employed complex learning tasks precisely specified manipulated. opus also potential application areas artificial intelligence, notably, truth maintenance.",4 "sine cosine crow search algorithm: powerful hybrid meta heuristic global optimization. paper presents novel hybrid algorithm named sine cosine crow search algorithm. propose sccsa, two novel algorithms considered including crow search algorithm (csa) sine cosine algorithm (sca). advantages two algorithms considered utilize design efficient hybrid algorithm perform significantly better various benchmark functions. combination concept operators two algorithms enable sccsa make appropriate trade-off exploration exploitation abilities algorithm. evaluate performance proposed sccsa, seven well-known benchmark functions utilized. results indicated proposed hybrid algorithm able provide competitive solution comparing state-of-the-art meta heuristics.",4 "convex model non-negative matrix factorization dimensionality reduction physical space. collaborative convex framework factoring data matrix $x$ non-negative product $as$, sparse coefficient matrix $s$, proposed.
restrict columns dictionary matrix $a$ coincide certain columns data matrix $x$, thereby guaranteeing physically meaningful dictionary dimensionality reduction. use $l_{1,\infty}$ regularization select dictionary data show leads exact convex relaxation $l_0$ case distinct noise free data. also show relax restriction-to-$x$ constraint initializing alternating minimization approach solution convex model, obtaining dictionary close necessarily $x$. focus applications proposed framework hyperspectral endmember abundances identification also show application blind source separation nmr data.",19 "verifying platform digital imaging: multi-tool strategy. fiji java platform widely used biologists experimental scientists process digital images. particular, research - made together biologists team; use fiji pre-processing steps undertaking homological digital processing images. previous work, formalised correctness programs use homological techniques analyse digital images. however, verification fiji's pre-processing step missed. paper, present multi-tool approach filling gap, based combination why/krakatoa, coq acl2.",4 fusion multispectral satellite imagery using cluster graphics processing unit. paper presents parallel implementation existing image fusion methods graphical cluster. parallel implementations methods based discrete wavelet transformation (haars daubechies discrete wavelet transform) developed. experiments performed cluster using gpu cpu performance gains estimated use developed parallel implementations process satellite images satellite landsat 7. implementation graphic cluster provides performance improvement 2 18 times. quality considered methods evaluated ergas qnr metrics. results show performance gains retaining quality cluster gpu compared results obtained authors researchers cpu single gpu.,4 "modeling discrete interventional data using directed cyclic graphical models. 
outline representation discrete multivariate distributions terms interventional potential functions globally normalized. representation used model effects interventions, independence properties encoded model represented directed graph allows cycles. addition discussing inference sampling representation, give exponential family parametrization allows parameter estimation stated convex optimization problem; also give convex relaxation task simultaneous parameter structure learning using group l1-regularization. model evaluated simulated data intracellular flow cytometry data.",19 "automated software vulnerability detection machine learning. thousands security vulnerabilities discovered production software year, either reported publicly common vulnerabilities exposures database discovered internally proprietary code. vulnerabilities often manifest subtle ways obvious code reviewers developers themselves. wealth open source code available analysis, opportunity learn patterns bugs lead security vulnerabilities directly data. paper, present data-driven approach vulnerability detection using machine learning, specifically applied c c++ programs. first compile large dataset hundreds thousands open-source functions labeled outputs static analyzer. compare methods applied directly source code methods applied artifacts extracted build process, finding source-based models perform better. also compare application deep neural network models traditional models random forests find best performance comes combining features learned deep models tree-based models. ultimately, highest performing model achieves area precision-recall curve 0.49 area roc curve 0.87.",4 "topicrnn: recurrent neural network long-range semantic dependency. paper, propose topicrnn, recurrent neural network (rnn)-based language model designed directly capture global semantic meaning relating words document via latent topics. 
sequential nature, rnns good capturing local structure word sequence - semantic syntactic - might face difficulty remembering long-range dependencies. intuitively, long-range dependencies semantic nature. contrast, latent topic models able capture global underlying semantic structure document account word ordering. proposed topicrnn model integrates merits rnns latent topic models: captures local (syntactic) dependencies using rnn global (semantic) dependencies using latent topics. unlike previous work contextual rnn language modeling, model learned end-to-end. empirical results word prediction show topicrnn outperforms existing contextual rnn baselines. addition, topicrnn used unsupervised feature extractor documents. sentiment analysis imdb movie review dataset report error rate $6.28\%$. comparable state-of-the-art $5.91\%$ resulting semi-supervised approach. finally, topicrnn also yields sensible topics, making useful alternative document models latent dirichlet allocation.",4 "correction method binary classifier applied multi-label pairwise models. work, addressed issue applying stochastic classifier local, fuzzy confusion matrix framework multi-label classification. proposed novel solution problem correcting label pairwise ensembles. main step correction procedure compute classifier-specific competence cross-competence measures, estimates error pattern underlying classifier. considered two improvements method obtaining confusion matrices. first one aimed deal imbalanced labels. utilizes double labelled instances usually removed pairwise transformation. proposed methods evaluated using 29 benchmark datasets. order assess efficiency introduced models, compared 1 state-of-the-art approach correction scheme based original method confusion matrix estimation. comparison performed using four different multi-label evaluation measures: macro micro-averaged f1 loss, zero-one loss hamming loss.
additionally, investigated relations classification quality, expressed terms different quality criteria, characteristics multi-label datasets average imbalance ratio label density. experimental study reveals correction approaches significantly outperforms reference method terms zero-one loss.",4 "anomaly detection xml-structured soap messages using tree-based association rule mining. web services software systems designed supporting interoperable dynamic cross-enterprise interactions. result attacks web services catastrophic causing disclosure enterprises' confidential data. new approaches attacking arise every day, anomaly detection systems seem invaluable tools context. aim work target attacks reside web service layer extensible markup language (xml)-structured simple object access protocol (soap) messages. studying shortcomings existing solutions, new approach detecting anomalies web services outlined. specifically, proposed technique illustrates identify anomalies employing mining methods xml-structured soap messages. technique also takes advantages tree-based association rule mining extract knowledge training phase, used test phase detect anomalies. addition, novel composition techniques brings nearly low false alarm rate maintaining detection rate reasonably high, shown case study.",4 "semantic knowledge graph: compact, auto-generated model real-time traversal ranking relationship within domain. paper describes new kind knowledge representation mining system calling semantic knowledge graph. heart, semantic knowledge graph leverages inverted index, along complementary uninverted index, represent nodes (terms) edges (the documents within intersecting postings lists multiple terms/nodes). provides layer indirection pair nodes corresponding edge, enabling edges materialize dynamically underlying corpus statistics. result, combination nodes edges nodes materialize scored reveal latent relationships nodes. 
provides numerous benefits: knowledge graph built automatically real-world corpus data, new nodes - along combined edges - instantly materialized arbitrary combination preexisting nodes (using set operations), full model semantic relationships entities within domain represented dynamically traversed using highly compact representation graph. system widespread applications areas diverse knowledge modeling reasoning, natural language processing, anomaly detection, data cleansing, semantic search, analytics, data classification, root cause analysis, recommendations systems. main contribution paper introduction novel system - semantic knowledge graph - able dynamically discover score interesting relationships arbitrary combination entities (words, phrases, extracted concepts) dynamically materializing nodes edges compact graphical representation built automatically corpus data representative knowledge domain.",4 "tight bounds proper equivalence query learning dnf. prove new structural lemma partial boolean functions $f$, call seed lemma dnf. using lemma, give first subexponential algorithm proper learning dnf angluin's equivalence query (eq) model. algorithm time query complexity $2^{\tilde{o}(\sqrt{n})}$, optimal. also give new result certificates dnf-size, simple algorithm properly pac-learning dnf, new results eq-learning $\log n$-term dnf decision trees.",4 "semantic segmentation reverse attention. recent development fully convolutional neural network enables efficient end-to-end learning semantic segmentation. traditionally, convolutional classifiers taught learn representative semantic features labeled semantic objects. work, propose reverse attention network (ran) architecture trains network capture opposite concept (i.e., associated target class) well. ran three-branch network performs direct, reverse reverse-attention learning processes simultaneously. extensive experiments conducted show effectiveness ran semantic segmentation.
built upon deeplabv2-largefov, ran achieves state-of-the-art miou score (48.1%) challenging pascal-context dataset. significant performance improvements also observed pascal-voc, person-part, nyudv2 ade20k datasets.",4 "robust low-rank tensor recovery: models algorithms. robust tensor recovery plays instrumental role robustifying tensor decompositions multilinear data analysis outliers, gross corruptions missing values diverse array applications. paper, study problem robust low-rank tensor recovery convex optimization framework, drawing upon recent advances robust principal component analysis tensor completion. propose tailored optimization algorithms global convergence guarantees solving constrained lagrangian formulations problem. algorithms based highly efficient alternating direction augmented lagrangian accelerated proximal gradient methods. also propose nonconvex model often improve recovery results convex models. investigate empirical recoverability properties convex nonconvex formulations compare computational performance algorithms simulated data. demonstrate number real applications practical effectiveness convex optimization framework robust low-rank tensor recovery.",19 "protein secondary structure prediction long short term memory networks. prediction protein secondary structure amino acid sequence classical bioinformatics problem. common methods use feed forward neural networks svms combined sliding window, models naturally handle sequential data. recurrent neural networks generalization feed forward neural network naturally handle sequential data. use bidirectional recurrent neural network long short term memory cells prediction secondary structure evaluate using cb513 dataset. secondary structure 8-class problem report better performance (0.674) state art (0.664). model includes feed forward networks long short term memory cells, path explored.",16 "w2vlda: almost unsupervised system aspect based sentiment analysis. 
increase online customer opinions specialised websites social networks, necessity automatic systems help organise classify customer reviews domain-specific aspect/categories sentiment polarity important ever. supervised approaches aspect based sentiment analysis obtain good results domain/language trained on, manually labelled data training supervised systems domains languages usually costly time consuming. work describe w2vlda, almost unsupervised system based topic modelling, combined unsupervised methods minimal configuration, performs aspect/category classification, aspect-terms/opinion-words separation sentiment polarity classification given domain language. evaluate performance aspect sentiment classification multilingual semeval 2016 task 5 (absa) dataset. show competitive results several languages (english, spanish, french dutch) domains (hotels, restaurants, electronic-devices).",4 "uniform transformation non-separable probability distributions. theoretical framework developed describe transformation distributes probability density functions uniformly space. one dimension, cumulative distribution used, generalize higher dimensions, non-separable distributions. potential function shown link probability density functions transformation, generalize cumulative. numerical method developed compute potential, examples shown two dimensions.",4 machine learning etudes astrophysics: selection functions mock cluster catalogs. making mock simulated catalogs important component astrophysical data analysis. selection criteria observed astronomical objects often complicated derived first principles. however existence observed group objects well-suited problem machine learning classification. paper use one-class classifiers learn properties observed catalog clusters galaxies rosat pick clusters mock simulations resemble observed rosat catalog. show method used study cross-correlations thermal sunyaev-zel'dovich signals number density maps x-ray selected cluster catalogs.
method reduces bias due hand-tuning selection function readily scalable large catalogs high-dimensional space astrophysical features.,1 "multimodal emotion recognition using multimodal deep learning. enhance performance affective models reduce cost acquiring physiological signals real-world applications, adopt multimodal deep learning approach construct affective models multiple physiological signals. unimodal enhancement task, indicate best recognition accuracy 82.11% seed dataset achieved shared representations generated deep autoencoder (dae) model. multimodal facilitation tasks, demonstrate bimodal deep autoencoder (bdae) achieves mean accuracies 91.01% 83.25% seed deap datasets, respectively, much superior state-of-the-art approaches. cross-modal learning task, experimental results demonstrate mean accuracy 66.34% achieved seed dataset shared representations generated eeg-based dae training samples shared representations generated eye-based dae testing sample, vice versa.",4 "implicit gradient neural networks positive-definite mass matrix online linear equations solving. motivated advantages achieved implicit analogue net solving online linear equations, novel implicit neural model designed based conventional explicit gradient neural networks letter introducing positive-definite mass matrix. addition taking advantages implicit neural dynamics, proposed implicit gradient neural networks still achieve globally exponential convergence unique theoretical solution linear equations also global stability even no-solution multi-solution situations. simulative results verify theoretical convergence analysis proposed neural dynamics.",4 "tail bounds volume sampled linear regression. $n \times d$ design matrix linear regression problem given, response point hidden unless explicitly requested. goal observe small number $k \ll n$ responses, produce weight vector whose sum square loss points $1+\epsilon$ times minimum. standard approach problem use i.i.d. 
leverage score sampling, approach known perform poorly $k$ small (e.g., $k = d$); cases, dominated volume sampling, joint sampling method explicitly promotes diversity. methods compare larger $k$ previously understood. prove volume sampling poor behavior large $k$ - indeed worse leverage score sampling. also show repair volume sampling using new padding technique. prove padded volume sampling least good tail bound leverage score sampling: sample size $k=o(d\log d + d/\epsilon)$ suffices guarantee total loss $1+\epsilon$ times minimum high probability. main technical challenge proving tail bounds sums dependent random matrices arise volume sampling.",4 "solving multiclass learning problems via error-correcting output codes. multiclass learning problems involve finding definition unknown function f(x) whose range discrete set containing k > 2 values (i.e., k ``classes''). definition acquired studying collections training examples form [x_i, f(x_i)]. existing approaches multiclass learning problems include direct application multiclass algorithms decision-tree algorithms c4.5 cart, application binary concept learning algorithms learn individual binary functions k classes, application binary concept learning algorithms distributed output representations. paper compares three approaches new technique error-correcting codes employed distributed output representation. show output representations improve generalization performance c4.5 backpropagation wide range multiclass learning tasks. also demonstrate approach robust respect changes size training sample, assignment distributed representations particular classes, application overfitting avoidance techniques decision-tree pruning. finally, show that---like methods---the error-correcting code technique provide reliable class probability estimates.
taken together, results demonstrate error-correcting output codes provide general-purpose method improving performance inductive learning programs multiclass problems.",4 "orb-slam: versatile accurate monocular slam system. paper presents orb-slam, feature-based monocular slam system operates real time, small large, indoor outdoor environments. system robust severe motion clutter, allows wide baseline loop closing relocalization, includes full automatic initialization. building excellent algorithms recent years, designed scratch novel system uses features slam tasks: tracking, mapping, relocalization, loop closing. survival fittest strategy selects points keyframes reconstruction leads excellent robustness generates compact trackable map grows scene content changes, allowing lifelong operation. present exhaustive evaluation 27 sequences popular datasets. orb-slam achieves unprecedented performance respect state-of-the-art monocular slam approaches. benefit community, make source code public.",4 "hierarchical deep temporal model group activity recognition. group activity recognition, temporal dynamics whole activity inferred based dynamics individual people representing activity. build deep model capture dynamics based lstm (long-short term memory) models. make use observations, present 2-stage deep temporal model group activity recognition problem. model, lstm model designed represent action dynamics individual people sequence another lstm model designed aggregate human-level information whole activity understanding. evaluate model two datasets: collective activity dataset new volleyball dataset. experimental results demonstrate proposed model improves group activity recognition performance compared baseline methods.",4 "self-taught artificial agent multi-physics computational model personalization. personalization process fitting model patient data, critical step towards application multi-physics computational models clinical practice.
designing robust personalization algorithms often tedious, time-consuming, model- data-specific process. propose use artificial intelligence concepts learn task, inspired human experts manually perform it. problem reformulated terms reinforcement learning. off-line phase, vito, self-taught artificial agent, learns representative decision process model exploration computational model: learns model behaves change parameters. agent automatically learns optimal strategy on-line personalization. algorithm model-independent; applying new model requires adjusting hyper-parameters agent defining observations match. full knowledge model required. vito tested synthetic scenario, showing could learn optimize cost functions generically. vito applied inverse problem cardiac electrophysiology personalization whole-body circulation model. obtained results suggested vito could achieve equivalent, better goodness fit standard methods, robust (up 11% higher success rates) faster (up seven times) convergence rate. artificial intelligence approach could thus make personalization algorithms generalizable self-adaptable patient model.",4 "logical interpretation dempster-shafer theory, application visual recognition. formulate dempster shafer belief functions terms propositional logic using implicit notion provability underlying dempster shafer theory. given set propositional clauses, assigning weights certain propositional literals enables belief functions explicitly computed using network reliability techniques. also, logical procedure corresponding updating belief functions using dempster's rule combination shown. analysis formalizes implementation belief functions within assumption-based truth maintenance system (atms). describe extension atms-based visual recognition system, victors, logical formulation dempster shafer theory. without dempster shafer theory, victors computes possible visual interpretations (i.e. logical models) without determining best interpretation(s). 
incorporating dempster shafer theory enables optimal visual interpretations computed logical semantics maintained.",4 "streaming small-footprint keyword spotting using sequence-to-sequence models. develop streaming keyword spotting systems using recurrent neural network transducer (rnn-t) model: all-neural, end-to-end trained, sequence-to-sequence model jointly learns acoustic language model components. models trained predict either phonemes graphemes subword units, thus allowing us detect arbitrary keyword phrases, without out-of-vocabulary words. order adapt models requirements keyword spotting, propose novel technique biases rnn-t system towards specific keyword interest. systems compared strong sequence-trained, connectionist temporal classification (ctc) based ""keyword-filler"" baseline, augmented separate phoneme language model. overall, rnn-t system proposed biasing technique significantly improves performance baseline system.",4 "denoising using wavelets projections onto l1-ball. wavelet denoising denoising methods using concept sparsity based soft-thresholding. sparsity based denoising methods, assumed original signal sparse transform domains wavelet domain wavelet subsignals noisy signal projected onto l1-balls reduce noise. lecture note, shown size l1-ball equivalently soft threshold value determined using linear algebra. key step orthogonal projection onto epigraph set l1-norm cost function.",12 "translation quality estimation using recurrent neural network. paper describes submission shared task word/phrase level quality estimation (qe) first conference statistical machine translation (wmt16). objective shared task predict given word/phrase correct/incorrect (ok/bad) translation given sentence. paper, propose novel approach word level quality estimation using recurrent neural network language model (rnn-lm) architecture. rnn-lms found effective different natural language processing (nlp) applications.
rnn-lm mainly used vector space language modeling different nlp problems. task, modify architecture rnn-lm. modified system predicts label (ok/bad) slot rather predicting word. input system word sequence, similar standard rnn-lm. approach language independent requires translated text qe. estimate phrase level quality, use output word level qe system.",4 "anchors hierarchy: using triangle inequality survive high dimensional data. paper metric data structures high-dimensional non-euclidean space permit cached sufficient statistics accelerations learning algorithms. recently shown less 10 dimensions, decorating kd-trees additional ""cached sufficient statistics"" first second moments contingency tables provide satisfying acceleration wide range statistical learning tasks kernel regression, locally weighted regression, k-means clustering, mixture modeling bayes net learning. paper, begin defining anchors hierarchy - fast data structure algorithm localizing data based triangle-inequality-obeying distance metric. show this, right, gives fast effective clustering data. importantly show produce well-balanced structure similar ball-tree (omohundro, 1991) kind metric tree (uhlmann, 1991; ciaccia, patella, & zezula, 1997) way neither ""top-down"" ""bottom-up"" instead ""middle-out"". show structure, decorated cached sufficient statistics, allows wide variety statistical learning algorithms accelerated even thousands dimensions.",4 "graph-theoretic spatiotemporal context modeling video saliency detection. important challenging problem computer vision, video saliency detection typically cast spatiotemporal context modeling problem consecutive frames. result, key issue video saliency detection effectively capture intrinsic properties atomic video structures well associated contextual interactions along spatial temporal dimensions.
motivated observation, propose graph-theoretic video saliency detection approach based adaptive video structure discovery, carried within spatiotemporal atomic graph. graph-based manifold propagation, proposed approach capable effectively modeling semantically contextual interactions among atomic video structures saliency detection preserving spatial smoothness temporal consistency. experiments demonstrate effectiveness proposed approach several benchmark datasets.",4 "entity retrieval text mining online reputation monitoring. online reputation monitoring (orm) concerned use computational tools measure reputation entities online, politicians companies. practice, current orm methods constrained generation data analytics reports, aggregate statistics popularity sentiment social media. argue format restrictive end users often like flexibility search entity-centric information available predefined charts. such, propose inclusion entity retrieval capabilities first step towards extension current orm capabilities. however, entity's reputation also influenced entity's relationships entities. therefore, address problem entity-relationship (e-r) retrieval goal search multiple connected entities. challenging problem traditional entity search systems cannot cope with. besides e-r retrieval also believe orm would benefit text-based entity-centric prediction capabilities, predicting entity popularity social media based news events outcome political surveys. however, none tasks provide useful results effective entity disambiguation sentiment analysis tailored context orm. consequently, thesis address two computational problems online reputation monitoring: entity retrieval text mining. researched developed methods extract, retrieve predict entity-centric information spread across web.",4 "clustering markov decision processes continual transfer. 
present algorithms effectively represent set markov decision processes (mdps), whose optimal policies already learned, smaller source subset lifelong, policy-reuse-based transfer learning reinforcement learning. necessary number previous tasks large cost measuring similarity counteracts benefit transfer. source subset forms `$\epsilon$-net' original set mdps, sense previous mdp $m_p$, source $m^s$ whose optimal policy $<\epsilon$ regret $m_p$. contributions follows. present exp-3-transfer, principled policy-reuse algorithm optimally reuses given source policy set learning new mdp. present framework cluster previous mdps extract source subset. framework consists (i) distance $d_v$ mdps measure policy-based similarity mdps; (ii) cost function $g(\cdot)$ uses $d_v$ measure good particular clustering generating useful source tasks exp-3-transfer (iii) provably convergent algorithm, mhav, finding optimal clustering. validate algorithms experiments surveillance domain.",4 "borrowing treasures wealthy: deep transfer learning selective joint fine-tuning. deep neural networks require large amount labeled training data supervised learning. however, collecting labeling much data might infeasible many cases. paper, introduce source-target selective joint fine-tuning scheme improving performance deep learning tasks insufficient training data. scheme, target learning task insufficient training data carried simultaneously another source learning task abundant training data. however, source learning task use existing training data. core idea identify use subset training images original source learning task whose low-level characteristics similar target learning task, jointly fine-tune shared convolutional layers tasks. specifically, compute descriptors linear nonlinear filter bank responses training images tasks, use descriptors search desired subset training samples source learning task. 
experiments demonstrate selective joint fine-tuning scheme achieves state-of-the-art performance multiple visual classification tasks insufficient training data deep learning. tasks include caltech 256, mit indoor 67, oxford flowers 102 stanford dogs 120. comparison fine-tuning without source domain, proposed method improve classification accuracy 2% - 10% using single model.",4 "ontological multidimensional data models contextual data quality. data quality assessment data cleaning context-dependent activities. motivated observation, propose ontological multidimensional data model (omd model), used model represent contexts logic-based ontologies. data assessment mapped context, additional analysis, processing, quality data extraction. resulting contexts allow representation dimensions, multidimensional data quality assessment becomes possible. core multidimensional context include generalized multidimensional data model datalog+/- ontology provably good properties terms query answering. main components used represent dimension hierarchies, dimensional constraints, dimensional rules, define predicates quality data specification. query answering relies upon triggers navigation dimension hierarchies, becomes basic tool extraction quality data. omd model interesting per se, beyond applications data quality. allows logic-based, computationally tractable representation multidimensional data, extending previous multidimensional data models additional expressive power functionalities.",4 learning spanish dialects twitter. paper maps large-scale variation spanish language employing corpus based geographically tagged twitter messages. lexical dialects extracted analysis variants tens concepts. resulting maps show linguistic variation unprecedented scale across globe. 
discuss properties main dialects within machine learning approach find varieties spoken urban areas international character contrast country areas dialects show regional uniformity.,19 "graph transformation planning via abstraction. modern software systems increasingly incorporate self-* behavior adapt changes environment runtime. adaptations often involve reconfiguring software architecture system. many systems also need manage architecture themselves, i.e., need planning component autonomously decide reconfigurations execute reach desired target configuration. specification reconfigurations, employ graph transformation systems (gts) due close relation graphs uml object diagrams. solve resulting planning problems planning system works directly gts. features domain-independent heuristic uses solution length abstraction original problem estimate. finally, provide experimental results two different domains, confirm heuristic performs better another domain-independent heuristic resembles heuristics employed related work.",4 "scaling laws human speech, decreasing emergence new words generalized model. human language, typical complex system, organization evolution attractive topic physical cultural researchers. paper, present first exhaustive analysis text organization human speech. two important results that: (i) construction organization spoken language characterized zipf's law heaps' law, observed written texts; (ii) word frequency vs. rank distribution growth distinct words increase text length shows significant differences book speech. speech word frequency distribution concentrated higher frequency words, emergence new words decreases much rapidly content length grows. based observations, new generalized model proposed explain complex dynamical behaviors differences speech book.",4 "backprop functor: compositional perspective supervised learning. 
supervised learning algorithm searches set functions $a \to b$ parametrised space $p$ find best approximation ideal function $f\colon a \to b$. taking examples $(a,f(a)) \in a\times b$, updating parameter according rule. define category update rules may composed, show gradient descent---with respect fixed step size error function satisfying certain property---defines monoidal functor category parametrised functions category update rules. provides structural perspective backpropagation, well broad generalisation neural networks.",12 "dual fast slow feature interaction biologically inspired visual recognition human action. computational neuroscience studies examined human visual system functional magnetic resonance imaging (fmri) identified model mammalian brain pursues two distinct pathways (for recognition biological movement tasks). brain, dorsal stream analyzes information motion (optical flow), fast features, ventral stream (form pathway) analyzes form information (through active basis model based incremental slow feature analysis) slow features. proposed approach suggests motion perception human visual system composes fast slow feature interactions identifies biological movements. form features visual system biologically follows application active basis model incremental slow feature analysis extraction slowest form features human objects movements ventral stream. applying incremental slow feature analysis provides opportunity use action prototypes. extract slowest features episodic observation required fast features updates processing motion information every frames. experimental results shown promising accuracy proposed model good performance two datasets (kth weizmann).",4 "training neural networks using power linear units (polus). paper, introduce ""power linear unit"" (polu) increases nonlinearity capacity neural network thus helps improving performance. polu adopts several advantages previously proposed activation functions. 
first, output polu positive inputs designed identity avoid gradient vanishing problem. second, polu non-zero output negative inputs output mean units close zero, hence reducing bias shift effect. thirdly, saturation negative part polu, makes noise-robust negative inputs. furthermore, prove polu able map portions every layer's input space using power function thus increases number response regions neural network. use image classification comparing proposed activation function others. experiments, mnist, cifar-10, cifar-100, street view house numbers (svhn) imagenet used benchmark datasets. neural networks implemented include widely-used elu-network, resnet-50, vgg16, plus couple shallow networks. experimental results show proposed activation function outperforms state-of-the-art models networks.",4 "deep neural networks rival representation primate cortex core visual object recognition. primate visual system achieves remarkable visual object recognition performance even brief presentations changes object exemplar, geometric transformations, background variation (a.k.a. core visual object recognition). remarkable performance mediated representation formed inferior temporal (it) cortex. parallel, recent advances machine learning led ever higher performing models object recognition using artificial deep neural networks (dnns). remains unclear, however, whether representational performance dnns rivals brain. accurately produce comparison, major difficulty unifying metric accounts experimental limitations amount noise, number neural recording sites, number trials, computational limitations complexity decoding classifier number classifier training examples. work perform direct comparison corrects experimental limitations computational considerations. part methodology, propose extension ""kernel analysis"" measures generalization accuracy function representational complexity. 
evaluations show that, unlike previous bio-inspired models, latest dnns rival representational performance cortex visual object recognition task. furthermore, show models perform well measures representational performance also perform well measures representational similarity measures predicting individual multi-unit responses. whether dnns rely computational mechanisms similar primate visual system yet determined, but, unlike previous bio-inspired models, possibility cannot ruled merely representational performance grounds.",16 "spelling error trends patterns sindhi. statistical error correction technique accurate widely used approach today, language like sindhi low resourced language trained corpora's available, statistical techniques possible all. instead useful alternative would exploit various spelling error trends sindhi using rule based approach. designing technique essential prerequisite would study various error patterns language. paper presents various studies spelling error trends types sindhi language. research shows error trends common languages also encountered sindhi exist error patterns catered specifically sindhi language.",4 "enhanced optimization composite objectives novelty selection. important benefit multi-objective search maintains diverse population candidates, helps deceptive problems particular. diversity useful, however: candidates optimize one objective ignoring others rarely helpful. paper proposes solution: original objectives replaced linear combinations, thus focusing search useful tradeoffs objectives. compensate loss diversity, transformation accompanied selection mechanism favors novelty. highly deceptive problem discovering minimal sorting networks, approach finds better solutions, finds faster consistently standard methods. therefore promising approach solving deceptive problems multi-objective optimization.",4 "factorized asymptotic bayesian inference factorial hidden markov models. 
factorial hidden markov models (fhmms) powerful tools modeling sequential data. learning fhmms yields challenging simultaneous model selection issue, i.e., selecting number multiple markov chains dimensionality chain. main contribution address model selection issue extending factorized asymptotic bayesian (fab) inference fhmms. first, offer better approximation marginal log-likelihood previous fab inference. key idea integrate transition probabilities, yet still apply laplace approximation emission probabilities. second, prove two similar hidden states fhmm, i.e. one redundant, fab almost surely shrink eliminate one them, making model parsimonious. experimental results show fab fhmms significantly outperforms state-of-the-art nonparametric bayesian ifhmm variational fhmm model selection accuracy, competitive held-out perplexity.",19 "new approach dynamic traveling salesman problem: hybrid ant colony optimization descending gradient. nowadays swarm intelligence-based algorithms used widely optimize dynamic traveling salesman problem (dtsp). paper, used mixed method ant colony optimization (aco) and gradient descent optimize dtsp differs aco algorithm evaporation rate innovative data. approach prevents premature convergence escape local optimum spots also makes possible find better solutions algorithm. paper, going offer gradient descent aco algorithm comparison former methods shows algorithm significantly improved routes optimization.",4 "variable elimination fourier domain. ability represent complex high dimensional probability distributions compact form one key insights field graphical models. factored representations ubiquitous machine learning lead major computational advantages. explore different type compact representation based discrete fourier representations, complementing classical approach based conditional independencies. show large class probabilistic graphical models compact fourier representation. 
theoretical result opens entirely new way approximating probability distribution. demonstrate significance approach applying variable elimination algorithm. compared traditional bucket representation approximate inference algorithms, obtain significant improvements.",4 "action tubelet detector spatio-temporal action localization. current state-of-the-art approaches spatio-temporal action localization rely detections frame level linked tracked across time. paper, leverage temporal continuity videos instead operating frame level. propose action tubelet detector (act-detector) takes input sequence frames outputs tubelets, i.e., sequences bounding boxes associated scores. way state-of-the-art object detectors rely anchor boxes, act-detector based anchor cuboids. build upon ssd framework. convolutional features extracted frame, scores regressions based temporal stacking features, thus exploiting information sequence. experimental results show leveraging sequences frames significantly improves detection performance using individual frames. gain tubelet detector explained accurate scores precise localization. act-detector outperforms state-of-the-art methods frame-map video-map j-hmdb ucf-101 datasets, particular high overlap thresholds.",4 "vote3deep: fast object detection 3d point clouds using efficient convolutional neural networks. paper proposes computationally efficient approach detecting objects natively 3d point clouds using convolutional neural networks (cnns). particular, achieved leveraging feature-centric voting scheme implement novel convolutional layers explicitly exploit sparsity encountered input. end, examine trade-off accuracy speed different architectures additionally propose use l1 penalty filter activations encourage sparsity intermediate representations. best knowledge, first work propose sparse convolutional layers l1 regularisation efficient large-scale processing 3d data. 
demonstrate efficacy approach kitti object detection benchmark show vote3deep models three layers outperform previous state art laser laser-vision based approaches margins 40% remaining highly competitive terms processing time.",4 "study bias offline evaluation recommendation algorithm. recommendation systems integrated majority large online systems filter rank information according user profiles. thus influences way users interact system and, consequence, bias evaluation performance recommendation algorithm computed using historical data (via offline evaluation). paper describes bias discuss relevance weighted offline evaluation reduce bias different classes recommendation algorithms.",4 "beyond gaussian pyramid: multi-skip feature stacking action recognition. state-of-the-art action feature extractors involve differential operators, act highpass filters tend attenuate low frequency action information. attenuation introduces bias resulting features generates ill-conditioned feature matrices. gaussian pyramid used feature enhancing technique encodes scale-invariant characteristics feature space attempt deal attenuation. however, core gaussian pyramid convolutional smoothing operation, makes incapable generating new features coarse scales. order address problem, propose novel feature enhancing technique called multi-skip feature stacking (mifs), stacks features extracted using family differential filters parameterized multiple time skips encodes shift-invariance frequency space. mifs compensates information lost using differential operators recapturing information coarse scales. recaptured information allows us match actions different speeds ranges motion. prove mifs enhances learnability differential-based features exponentially. resulting feature matrices mifs much smaller conditional numbers variances conventional methods. experimental results show significantly improved performance challenging action recognition event detection tasks. 
specifically, method exceeds state-of-the-arts hollywood2, ucf101 ucf50 datasets comparable state-of-the-arts hmdb51 olympics sports datasets. mifs also used speedup strategy feature extraction minimal accuracy cost.",4 "robust subgraph generation improves abstract meaning representation parsing. abstract meaning representation (amr) representation open-domain rich semantics, potential use fields like event extraction machine translation. node generation, typically done using simple dictionary lookup, currently important limiting factor amr parsing. propose small set actions derive amr subgraphs transformations spans text, allows robust learning stage. set construction actions generalize better previous approach, learned simple classifier. improve previous state-of-the-art result amr parsing, boosting end-to-end performance 3 f$_1$ ldc2013e117 ldc2014t12 datasets.",4 "combining independent modules solve multiple-choice synonym analogy problems. existing statistical approaches natural language problems coarse approximations true complexity language processing. such, single technique best problem instances. many researchers examining ensemble methods combine output successful, separately developed modules create accurate solutions. paper examines three merging rules combining probability distributions: well known mixture rule, logarithmic rule, novel product rule. rules applied state-of-the-art results two problems commonly used assess human mastery lexical semantics -- synonym questions analogy questions. three merging rules result ensembles accurate component modules. differences among three rules statistically significant, suggestive popular mixture rule best rule either two problems.",4 "improving deep pancreas segmentation ct mri images via recurrent neural contextual learning direct loss function. deep neural networks demonstrated promising performance accurate segmentation challenging organs (e.g., pancreas) abdominal ct mri scans. 
current deep learning approaches conduct pancreas segmentation processing sequences 2d image slices independently deep, dense per-pixel masking image, without explicitly enforcing spatial consistency constraint segmentation successive slices. propose new convolutional/recurrent neural network architecture address contextual learning segmentation consistency problem. deep convolutional sub-network first designed pre-trained scratch. output layer network module connected recurrent layers fine-tuned contextual learning, end-to-end manner. recurrent sub-network type long short-term memory (lstm) network performs segmentation image integrating neighboring slice segmentation predictions, form dependent sequence processing. additionally, novel segmentation-direct loss function (named jaccard loss) proposed deep networks trained optimize jaccard index (ji) directly. extensive experiments conducted validate proposed deep models, quantitative pancreas segmentation using ct mri scans. method outperforms state-of-the-art work ct [11] mri pancreas segmentation [1], respectively.",4 "optimisation crossdocking distribution centre simulation model. paper reports continuing research modelling order picking process within crossdocking distribution centre using simulation optimisation. aim project optimise discrete event simulation model understand factors affect finding optimal performance. initial investigation revealed precision selected simulation output performance measure number replications required evaluation optimisation objective function simulation influences ability optimisation technique. experimented common random numbers, order improve precision simulation output performance measure, intended use number replications utilised purpose initial number replications optimisation crossdocking distribution centre simulation model. results demonstrate improve precision selected simulation output performance measure value using common random numbers various levels replications. 
furthermore, optimising crossdocking distribution centre simulation model, able achieve optimal performance using fewer simulations runs simulation model uses common random numbers compared simulation model use common random numbers.",4 "viewpoint: artificial intelligence labour. welfare modern societies intrinsically linked wage labour. exceptions, modern human sell labour-power able reproduce biologically socially. thus, lingering fear technological unemployment features predominately theme among artificial intelligence researchers. short paper show that, past trends anything go by, fear irrational. contrary, argue main problem humanity facing normalisation extremely long working hours.",4 "peyma: tagged corpus persian named entities. goal ner task classify proper nouns text classes person, location, organization. important preprocessing step many nlp tasks question-answering summarization. although many research studies conducted area english state-of-the-art ner systems reached performances higher 90 percent terms f1 measure, research studies task persian. one main important causes may lack standard persian ner dataset train test ner systems. research create standard, big-enough tagged persian ner dataset distributed free research purposes. order construct standard dataset, studied standard ner datasets constructed english researches found almost datasets constructed using news texts. collected documents ten news websites. later, order provide annotators guidelines tag documents, studying guidelines used constructing conll muc standard english datasets, set guidelines considering persian linguistic rules.",4 "compatible discourse annotations? insights mapping rst-dt pdtb annotations. discourse-annotated corpora important resource community, often annotated according different frameworks. makes comparison annotations difficult, thereby also preventing researchers searching corpora unified way, using annotated data jointly train computational systems. 
several theoretical proposals recently made mapping relational labels different frameworks other, proposals far validated existing annotations. two largest discourse relation annotated resources, penn discourse treebank rhetorical structure theory discourse treebank, however annotated text, allowing direct comparison annotation layers. propose method automatically aligning discourse segments, evaluate existing mapping proposals comparing empirically observed proposed mappings. analysis highlights influence segmentation subsequent discourse relation labeling, shows agreement frameworks reasonable explicit relations, agreement implicit relations low. identify several sources systematic discrepancies two annotation schemes discuss consequences discrepancies future annotation training automatic discourse relation labellers.",4 "learning ordered representations nested dropout. paper, study ordered representations data different dimensions different degrees importance. learn representations introduce nested dropout, procedure stochastically removing coherent nested sets hidden units neural network. first present sequence theoretical results simple case semi-linear autoencoder. rigorously show application nested dropout enforces identifiability units, leads exact equivalence pca. extend algorithm deep models demonstrate relevance ordered representations number applications. specifically, use ordered property learned codes construct hash-based data structures permit fast retrieval, achieving retrieval time logarithmic database size independent dimensionality representation. allows codes hundreds times longer currently feasible retrieval. therefore avoid diminished quality associated short codes, still performing retrieval competitive speed existing methods. also show ordered representations promising way learn adaptive compression efficient online data reconstruction.",19 "slam based quasi dense reconstruction minimally invasive surgery scenes. 
recovering surgical scene structure laparoscope surgery crucial step surgical guidance augmented reality applications. paper, quasi dense reconstruction algorithm surgical scene proposed. based state-of-the-art slam system, exploiting initial exploration phase typically performed surgeon beginning surgery. show convert sparse slam map quasi dense scene reconstruction, using pairs keyframe images correlation-based featureless patch matching. validated approach live porcine experiment using computed tomography ground truth, yielding root mean squared error 4.9mm.",4 "local binary convolutional neural networks. propose local binary convolution (lbc), efficient alternative convolutional layers standard convolutional neural networks (cnn). design principles lbc motivated local binary patterns (lbp). lbc layer comprises set fixed sparse pre-defined binary convolutional filters updated training process, non-linear activation function set learnable linear weights. linear weights combine activated filter responses approximate corresponding activated filter responses standard convolutional layer. lbc layer affords significant parameter savings, 9x 169x number learnable parameters compared standard convolutional layer. furthermore, sparse binary nature weights also results 9x 169x savings model size compared standard convolutional layer. demonstrate theoretically experimentally local binary convolution layer good approximation standard convolutional layer. empirically, cnns lbc layers, called local binary convolutional neural networks (lbcnn), achieves performance parity regular cnns range visual datasets (mnist, svhn, cifar-10, imagenet) enjoying significant computational savings.",4 "bayesian incremental learning deep neural networks. industrial machine learning pipelines, data often arrive parts. particularly case deep neural networks, may expensive train model scratch time, one would rather use previously learned model new data improve performance. 
however, deep neural networks prone getting stuck suboptimal solution trained new data compared full dataset. work focuses continuous learning setup task always new parts data arrive sequentially. apply bayesian approach update posterior approximation new piece data find method outperform traditional approach experiments.",19 "prism: person re-identification via structured matching. person re-identification (re-id), emerging problem visual surveillance, deals maintaining entities individuals whilst traverse various locations surveilled camera network. visual perspective re-id challenging due significant changes visual appearance individuals cameras different pose, illumination calibration. globally challenge arises need maintain structurally consistent matches among individual entities across different camera views. propose prism, structured matching method jointly account challenges. view global problem weighted graph matching problem estimate edge weights learning predict based co-occurrences visual patterns training examples. co-occurrence based scores turn account appearance changes inferring likely unlikely visual co-occurrences appearing training instances. implement prism single shot multi-shot scenarios. prism uniformly outperforms state-of-the-art terms matching rate computationally efficient.",4 "sequential convolutional neural networks slot filling spoken language understanding. investigate usage convolutional neural networks (cnns) slot filling task spoken language understanding. propose novel cnn architecture sequence labeling takes account previous context words preserved order information pays special attention current word surrounding context. moreover, combines information past future words classification. 
proposed cnn architecture outperforms even previously best ensembling recurrent neural network model achieves state-of-the-art results f1-score 95.61% atis benchmark dataset without using additional linguistic knowledge resources.",4 "convolutional point-set representation: convolutional bridge densely annotated image 3d face alignment. present robust method estimating facial pose shape information densely annotated facial image. method relies convolutional point-set representation (cpr), carefully designed matrix representation summarize different layers information encoded set detected points annotated image. cpr disentangles dependencies shape different pose parameters enables updating different parameters sequential manner via convolutional neural networks recurrent layers. updating pose parameters, sample reprojection errors along predicted direction update parameters based pattern reprojection errors. technique boosts model's capability searching local minimum challenging scenarios. also demonstrate annotation different sources merged framework cpr contributes outperforming current state-of-the-art solutions 3d face alignment. experiments indicate proposed cprfa (cpr-based face alignment) significantly improves 3d alignment accuracy densely annotated image contains noise missing values, common ""in-the-wild"" acquisition scenarios.",4 "poisson convolution model characterizing topical content word frequency exclusivity. ongoing challenge analysis document collections summarize content terms set inferred themes interpreted substantively terms topics. current practice parametrizing themes terms frequent words limits interpretability ignoring differential use words across topics. argue words common exclusive theme effective characterizing topical content. consider setting professional editors annotated documents collection topic categories, organized tree, leaf-nodes correspond specific topics. document annotated multiple categories, different levels tree. 
we introduce a hierarchical poisson convolution model to analyze annotated documents in this setting. the model leverages the structure among categories defined by the professional editors to infer a clear semantic description of each topic in terms of words that are both frequent and exclusive. we carry out a large randomized experiment on amazon mechanical turk to demonstrate that topic summaries based on the frex score are more interpretable than the currently established frequency based summaries, and that the proposed model produces more efficient estimates of exclusivity than currently available models. we also develop a parallelized hamiltonian monte carlo sampler that allows the inference to scale to millions of documents.",4 "compact convolutional neural networks for classification of asynchronous steady-state visual evoked potentials. steady-state visual evoked potentials (ssveps) are neural oscillations from the parietal and occipital regions of the brain that are evoked by flickering visual stimuli. ssveps are robust signals measurable in the electroencephalogram (eeg) and are commonly used in brain-computer interfaces (bcis). however, methods for high-accuracy decoding of ssveps usually require hand-crafted approaches that leverage domain-specific knowledge of the stimulus signals, such as the specific temporal frequencies of the visual stimuli and their relative spatial arrangement. when this knowledge is unavailable, such as when ssvep signals are acquired asynchronously, such approaches tend to fail. in this paper, we show how a compact convolutional neural network (compact-cnn), which only requires raw eeg signals for automatic feature extraction, can be used to decode signals from a 12-class ssvep dataset without the need for any domain-specific knowledge or calibration data. we report an across-subject mean accuracy of approximately 80% (chance being 8.3%) and show this is substantially better than current state-of-the-art hand-crafted approaches using canonical correlation analysis (cca) and combined-cca. furthermore, we analyze our compact-cnn to examine the underlying feature representation, discovering that the deep learner extracts additional phase and amplitude related features associated with the structure of the dataset. 
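A frequency-exclusivity word score of the kind the abstract above calls frex is often formulated as a weighted harmonic mean of a word's within-topic frequency quantile and exclusivity quantile; the sketch below uses that formulation as an assumption for illustration, not verbatim from the paper.

```python
# Sketch of a frequency-exclusivity (frex-style) word score for one topic:
# a weighted harmonic mean of a word's frequency quantile and exclusivity
# quantile. The exact formulation is an assumption for illustration only.

def quantiles(values):
    """Empirical CDF value (rank / n) for each entry."""
    order = sorted(values)
    return [(order.index(v) + 1) / len(values) for v in values]

def frex(freq, excl, w=0.5):
    """Harmonic-mean combination of frequency and exclusivity quantiles."""
    fq, eq = quantiles(freq), quantiles(excl)
    return [1.0 / (w / e + (1.0 - w) / f) for f, e in zip(fq, eq)]

scores = frex([10, 1, 5], [0.9, 0.1, 0.5])
# the word that is both frequent and exclusive gets the highest score
```

A word that is frequent but shared across topics, or exclusive but rare, is pulled down by the harmonic mean, which is exactly the behaviour the paper argues for.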
we discuss how our compact-cnn shows promise for bci applications that allow users to freely gaze/attend to any stimulus at any time (e.g., asynchronous bci), as well as provides a method for analyzing ssvep signals in a way that might augment our understanding of the basic processing in the visual cortex.",4 "inferring strategies for sentence ordering in multidocument news summarization. the problem of organizing information for multidocument summarization so that the generated summary is coherent has received relatively little attention. while sentence ordering for single document summarization can be determined from the ordering of sentences in the input article, this is not the case for multidocument summarization where summary sentences may be drawn from different input articles. in this paper, we propose a methodology for studying the properties of ordering information in the news genre and describe experiments done with a corpus of multiple acceptable orderings we developed for this task. based on these experiments, we implemented a strategy for ordering information that combines constraints from the chronological order of events and topical relatedness. evaluation of our augmented algorithm shows a significant improvement of the ordering over two baseline strategies.",4 "rotation-sensitive regression for oriented scene text detection. text in natural images is of arbitrary orientations, requiring detection in terms of oriented bounding boxes. normally, a multi-oriented text detector often involves two key tasks: 1) text presence detection, which is a classification problem disregarding text orientation; 2) oriented bounding box regression, which concerns text orientation. previous methods rely on shared features for both tasks, resulting in degraded performance due to the incompatibility of the two tasks. to address this issue, we propose to perform classification and regression on features of different characteristics, extracted by two network branches of different designs. concretely, the regression branch extracts rotation-sensitive features by actively rotating the convolutional filters, while the classification branch extracts rotation-invariant features by pooling the rotation-sensitive features. 
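The ordering strategy in the summarization abstract above — chronology constrained by topical relatedness — can be sketched as grouping sentences into topical blocks, ordering blocks by their earliest event, and keeping each block chronological. The `(date, topic, text)` record format is an illustrative assumption.

```python
# Simplified sketch of combining chronological order with topical relatedness:
# sentences are grouped by topic, blocks are ordered by the earliest date they
# contain, and sentences inside a block stay in chronological order.
# The (date, topic, text) record format is an illustrative assumption.
from collections import defaultdict

def order_sentences(records):
    """records: list of (date, topic, text) tuples; returns ordered texts."""
    blocks = defaultdict(list)
    for date, topic, text in records:
        blocks[topic].append((date, text))
    ordered_blocks = sorted(blocks.values(), key=lambda b: min(d for d, _ in b))
    return [text for block in ordered_blocks for _, text in sorted(block)]

ordered = order_sentences([(3, "b", "s3"), (1, "a", "s1"),
                           (2, "b", "s2"), (4, "a", "s4")])
```

Pure chronological ordering would interleave the two topics; the topical grouping keeps related sentences adjacent, which is the coherence effect the paper evaluates.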
the proposed method, named rotation-sensitive regression detector (rrd), achieves state-of-the-art performance on several oriented scene text benchmark datasets, including icdar 2015, msra-td500, rctw-17 and coco-text. furthermore, rrd achieves a significant improvement on a ship collection dataset, demonstrating its generality to oriented object detection.",4 "arc-swift: a novel transition system for dependency parsing. transition-based dependency parsers often need sequences of local shift and reduce operations to produce certain attachments. correct individual decisions hence require global information about the sentence context, and mistakes cause error propagation. this paper proposes a novel transition system, arc-swift, that enables direct attachments between tokens farther apart with a single transition. this allows the parser to leverage lexical information more directly in transition decisions. hence, arc-swift can achieve significantly better performance with a small beam size. our parsers reduce error by 3.7--7.6% relative to those using existing transition systems on the penn treebank dependency parsing task and english universal dependencies.",4 "sparse multi-output gaussian processes for medical time series prediction. for real-time monitoring of hospital patients, high-quality inference of patients' health status using all information available from clinical covariates and lab tests is essential to enable successful medical interventions and improve patient outcomes. in this work, we develop and explore a bayesian nonparametric model based on gaussian process (gp) regression for hospital patient monitoring. our method, medgp, incorporates 24 clinical and lab covariates and supports a rich reference data set from which the relationships between these observed covariates may be inferred and exploited for high-quality inference of patient state over time. to do this, we develop a highly structured sparse gp kernel to enable tractable computation over tens of thousands of time points while estimating correlations among clinical covariates, patients, and periodicity in the high-dimensional time series of measurements of physiological signals. we apply medgp to data from hundreds of thousands of patients treated at the hospital of the university of pennsylvania. 
medgp has a number of benefits over current methods, including (i) not requiring alignment of the time series data, (ii) quantifying confidence intervals in the predictions, (iii) exploiting a vast and rich database of patients, and (iv) providing interpretable relationships among clinical covariates. we evaluate and compare the results of medgp on the task of online state prediction for three different patient subgroups.",19 "fast nonsmooth regularized risk minimization with continuation. in regularized risk minimization, the associated optimization problem becomes particularly difficult when both the loss and the regularizer are nonsmooth. existing approaches are either slow or have unclear convergence properties, are restricted to limited problem subclasses, or require careful setting of a smoothing parameter. in this paper, we propose a continuation algorithm that is applicable to a large class of nonsmooth regularized risk minimization problems, can be flexibly used with a number of existing solvers for the underlying smoothed subproblem, and has convergence results for the whole algorithm rather than for each of its subproblems. in particular, when accelerated solvers are used, the proposed algorithm achieves the fastest known rates of $o(1/t^2)$ on strongly convex problems, and $o(1/t)$ on general convex problems. experiments on nonsmooth classification and regression tasks demonstrate that the proposed algorithm outperforms the state-of-the-art.",4 "inverted residuals and linear bottlenecks: mobile networks for classification, detection and segmentation. in this paper we describe a new mobile architecture, mobilenetv2, that improves the state of the art performance of mobile models on multiple tasks and benchmarks as well as across a spectrum of different model sizes. we also describe efficient ways of applying these mobile models to object detection in a novel framework we call ssdlite. additionally, we demonstrate how to build mobile semantic segmentation models through a reduced form of deeplabv3 which we call mobile deeplabv3. the mobilenetv2 architecture is based on an inverted residual structure where the input and output of the residual block are thin bottleneck layers, opposite to traditional residual models which use expanded representations in the input; mobilenetv2 uses lightweight depthwise convolutions to filter features in the intermediate expansion layer. 
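The continuation idea in the abstract above — solve a smoothed subproblem, then tighten the smoothing and warm-start — can be illustrated on the absolute-value loss with Huber-style smoothing. The schedule, step size and objective are illustrative assumptions, not the paper's algorithm.

```python
# Illustration of continuation for the nonsmooth objective f(x) = |x - 1|:
# minimise a Huber-smoothed surrogate by gradient descent, then shrink the
# smoothing parameter mu and warm-start from the previous solution.
# Schedule and step size are illustrative assumptions, not from the paper.

def huber_grad(z, mu):
    """Gradient of the Huber smoothing of |z| with parameter mu."""
    return z / mu if abs(z) <= mu else (1.0 if z > 0 else -1.0)

def continuation(x0=5.0, mu0=1.0, stages=6, inner=200, step=0.05):
    x, mu = x0, mu0
    for _ in range(stages):
        for _ in range(inner):          # solve the smoothed subproblem
            x -= step * huber_grad(x - 1.0, mu)
        mu *= 0.5                       # tighten the smoothing (continuation)
    return x

x_star = continuation()                 # should approach the minimiser x = 1
```

Each stage inherits the previous solution, so the harder, less-smoothed subproblems start close to their optimum — the warm-starting that makes continuation fast.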
additionally, we find that it is important to remove non-linearities in the narrow layers in order to maintain representational power. we demonstrate that this improves performance and provide an intuition that led to this design. finally, our approach allows decoupling of the input/output domains from the expressiveness of the transformation, which provides a convenient framework for further analysis. we measure our performance on imagenet classification, coco object detection, and voc image segmentation. we evaluate the trade-offs between accuracy and number of operations measured by multiply-adds (madd), as well as the number of parameters.",4 "transfer learning from deep features for remote sensing and poverty mapping. the lack of reliable data in developing countries is a major obstacle to sustainable development, food security, and disaster relief. poverty data, for example, is typically scarce, sparse in coverage, and labor-intensive to obtain. remote sensing data such as high-resolution satellite imagery, on the other hand, is becoming increasingly available and inexpensive. unfortunately, such data is highly unstructured and currently no techniques exist to automatically extract useful insights to inform policy decisions and help direct humanitarian efforts. we propose a novel machine learning approach to extract large-scale socioeconomic indicators from high-resolution satellite imagery. the main challenge is that training data is scarce, making it difficult to apply modern techniques such as convolutional neural networks (cnn). we therefore propose a transfer learning approach where nighttime light intensities are used as a data-rich proxy. we train a fully convolutional cnn model to predict nighttime lights from daytime imagery, simultaneously learning features that are useful for poverty prediction. the model learns filters identifying different terrains and man-made structures, including roads, buildings, and farmlands, without any supervision beyond nighttime lights. we demonstrate that these learned features are highly informative for poverty mapping, even approaching the predictive performance of survey data collected in the field.",4 "efficient convolutional neural network for audio event detection. wireless distributed systems as used in sensor networks, internet-of-things and cyber-physical systems impose high requirements on resource efficiency. 
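The "lightweight depthwise convolutions" in the MobileNetV2 abstract above come down to simple parameter arithmetic: a depthwise k×k convolution plus a 1×1 pointwise convolution has far fewer weights than a standard k×k convolution. The channel sizes below are illustrative, not a specific MobileNetV2 layer.

```python
# Parameter-count comparison behind "lightweight depthwise convolutions":
# a standard k x k convolution vs a depthwise-separable one (depthwise k x k
# followed by a 1 x 1 pointwise convolution). Channel sizes are illustrative.

def standard_conv_params(k, c_in, c_out):
    return k * k * c_in * c_out

def separable_conv_params(k, c_in, c_out):
    return k * k * c_in + c_in * c_out   # depthwise + pointwise

std = standard_conv_params(3, 144, 32)   # 9 * 144 * 32  = 41472
sep = separable_conv_params(3, 144, 32)  # 1296 + 4608   = 5904
```

For 3×3 kernels the separable form is roughly 8–9× cheaper in both parameters and multiply-adds, which is what makes the expanded intermediate layers affordable on mobile hardware.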
advanced preprocessing and classification of data at the network edge can help to decrease the communication demand and to reduce the amount of data to be processed centrally. in the area of distributed acoustic sensing, the combination of algorithms with a high classification rate and resource-constrained embedded systems is essential. unfortunately, most algorithms for acoustic event detection have a high memory and computational demand and are not suited for execution at the network edge. this paper addresses these aspects by applying structural optimizations to a convolutional neural network for audio event detection to reduce the memory requirement by a factor of 500 and the computational effort by a factor of 2.1 while performing 9.2% better.",4 "denoising adversarial autoencoders. unsupervised learning is of growing interest because it unlocks the potential held in vast amounts of unlabelled data to learn useful representations for inference. autoencoders, a form of generative model, may be trained by learning to reconstruct unlabelled input data from a latent representation space. more robust representations may be produced by an autoencoder if it learns to recover clean input samples from corrupted ones. representations may be further improved by introducing regularisation during training to shape the distribution of the encoded data in the latent space. we suggest denoising adversarial autoencoders, which combine denoising and regularisation, shaping the distribution of the latent space using adversarial training. we introduce a novel analysis that shows how denoising may be incorporated into the training and sampling of adversarial autoencoders. experiments are performed to assess the contributions that denoising makes to the learning of representations for classification and sample synthesis. results suggest that autoencoders trained using a denoising criterion achieve higher classification performance, and can synthesise samples that are more consistent with the input data, than those trained without a corruption process.",4 "supervised learning of universal sentence representations from natural language inference data. many modern nlp systems rely on word embeddings, previously trained in an unsupervised manner on large corpora, as base features. efforts to obtain embeddings for larger chunks of text, such as sentences, have however not been so successful. 
several attempts at learning unsupervised representations of sentences have not reached satisfactory enough performance to be widely adopted. in this paper, we show how universal sentence representations trained using the supervised data of the stanford natural language inference datasets can consistently outperform unsupervised methods like skipthought vectors on a wide range of transfer tasks. much like how computer vision uses imagenet to obtain features, which can then be transferred to other tasks, our work tends to indicate the suitability of natural language inference for transfer learning to other nlp tasks. our encoder is publicly available.",4 "sparse dictionary-based attributes for action recognition and summarization. we present an approach for dictionary learning of action attributes via information maximization. we unify the class distribution and appearance information into an objective function for learning a sparse dictionary of action attributes. the objective function maximizes the mutual information between what has been learned and what remains to be learned, in terms of appearance information and class distribution, for each dictionary atom. we propose a gaussian process (gp) model for sparse representation to optimize the dictionary objective function. the sparse coding property allows a kernel with compact support in the gp to realize a very efficient dictionary learning process. hence we can describe an action video by a set of compact and discriminative action attributes. more importantly, we can recognize modeled action categories in a sparse feature space, which can be generalized to unseen and unmodeled action categories. experimental results demonstrate the effectiveness of our approach in action recognition and summarization.",4 "a model of inductive bias learning. a major problem in machine learning is that of inductive bias: how to choose a learner's hypothesis space so that it is large enough to contain a solution to the problem being learnt, yet small enough to ensure reliable generalization from reasonably-sized training sets. typically such bias is supplied by hand through the skill and insights of experts. in this paper a model for automatically learning bias is investigated. the central assumption of the model is that the learner is embedded within an environment of related learning tasks. within such an environment the learner can sample from multiple tasks, and hence it can search for a hypothesis space that contains good solutions to many of the problems in the environment. 
under certain restrictions on the set of all hypothesis spaces available to the learner, we show that a hypothesis space that performs well on a sufficiently large number of training tasks will also perform well when learning novel tasks in the same environment. explicit bounds are also derived, demonstrating that learning multiple tasks within an environment of related tasks can potentially give much better generalization than learning a single task.",4 "an analytic framework for maritime situation analysis. maritime domain awareness is critical for protecting sea lanes, ports, harbors, offshore structures and critical infrastructures against common threats and illegal activities. limited surveillance resources constrain maritime domain awareness and compromise full security coverage at all times. this situation calls for innovative intelligent systems for interactive situation analysis to assist marine authorities and security personnel in their routine surveillance operations. in this article, we propose a novel situation analysis framework to analyze marine traffic data and differentiate various scenarios of vessel engagement, for the purpose of detecting anomalies of interest for marine vessels that operate over some period of time in relative proximity to each other. the proposed framework views vessel behavior as probabilistic processes and uses machine learning to model common vessel interaction patterns. we represent the patterns of interest as left-to-right hidden markov models and classify such patterns using support vector machines.",4 "a phonetic based soundex & shapeex algorithm for a sindhi spell checker system. this paper presents a novel combinational phonetic algorithm for the sindhi language, to be used in developing a sindhi spell checker, which had not yet been developed prior to this work. the compound textual forms and glyphs of the sindhi language present a substantial challenge for developing a sindhi spell checker system and generating a similar suggestion list for misspelled words. in order to implement such a system, phonetic based sindhi language rules and patterns must be considered and taken into account for increasing accuracy and efficiency. the proposed system is developed with a blend of a phonetic based soundex algorithm and a shapeex algorithm for pattern and glyph matching, generating an accurate and efficient suggestion list for incorrect or misspelled sindhi words. 
a table of phonetically similar sounding sindhi characters for the soundex algorithm is also generated, along with another table containing similar glyph shape based character groups for the shapeex algorithm. this is the first ever attempt at this type of categorization and representation for the sindhi language.",4 "grouse and incremental svd. grouse (grassmannian rank-one update subspace estimation) is an incremental algorithm for identifying a subspace of r^n from a sequence of vectors in this subspace, where only a subset of components of each vector is revealed at each iteration. recent analysis has shown that grouse converges locally at an expected linear rate, under certain assumptions. grouse is similar in flavor to the incremental singular value decomposition algorithm, which updates the svd of a matrix following the addition of a single column. in this paper, we modify the incremental svd approach to handle missing data, and demonstrate that this modified approach is equivalent to grouse, for a certain choice of an algorithmic parameter.",4 "a probabilistic framework for discriminative dictionary learning. in this paper, we address the problem of discriminative dictionary learning (ddl), where sparse linear representation and classification are combined in a probabilistic framework. as such, a single discriminative dictionary and linear binary classifiers are learned jointly. by encoding sparse representation and discriminative classification models into a map setting, we propose a general optimization framework that allows for a data-driven tradeoff between faithful representation and accurate classification. as opposed to previous work, our learning methodology is capable of incorporating a diverse family of classification cost functions (including those used in popular boosting methods), while avoiding the need for involved optimization techniques. we show that ddl can be solved by a sequence of updates that make use of well-known and well-studied sparse coding and dictionary learning algorithms from the literature. to validate our ddl framework, we apply it to digit classification and face recognition and test it on standard benchmarks.",4 using sets of probability measures to represent uncertainty. we explore the use of sets of probability measures as a representation of uncertainty.,4 "emoticonsciousness. a temporal analysis of emoticon use in swedish, italian, german and english asynchronous electronic communication is reported. 
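To illustrate the phonetic-grouping idea that the sindhi spell checker abstract above adapts, here is the classic english soundex coding; the paper's sindhi character tables themselves are not reproduced here.

```python
# The classic English Soundex coding, shown to illustrate the phonetic-grouping
# idea the paper adapts to Sindhi (the Sindhi character tables themselves are
# not reproduced here).

SOUNDEX = {c: d for d, letters in {
    "1": "BFPV", "2": "CGJKQSXZ", "3": "DT",
    "4": "L", "5": "MN", "6": "R"}.items() for c in letters}

def soundex(word):
    word = word.upper()
    code = word[0]
    prev = SOUNDEX.get(word[0], "")
    for c in word[1:]:
        d = SOUNDEX.get(c, "")
        if d and d != prev:
            code += d
        if c not in "HW":          # H and W do not reset the previous code
            prev = d
    return (code + "000")[:4]
```

Words that sound alike map to the same four-character code (e.g. "Robert" and "Rupert"), which is what lets a spell checker retrieve phonetically similar suggestions for a misspelled word.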
emoticons are classified as positive, negative and neutral. postings to newsgroups over a 66 week period are considered. the aggregate analysis of emoticon use in newsgroups on science and politics tends on the whole to be consistent over the entire time period. where possible, events that coincide with divergences from the trends for language-subject pairs are noted. political discourse in italian over the period shows a marked use of negative emoticons, and in swedish, of positive emoticons.",4 "predicting an enemy's actions improves commander decision-making. the defense advanced research projects agency (darpa) real-time adversarial intelligence and decision-making (raid) program is investigating the feasibility of ""reading the mind of the enemy"" - to estimate and anticipate, in real-time, the enemy's likely goals, deceptions, actions, movements and positions. the program focuses specifically on urban battles at echelons of battalion and below. the raid program leverages approximate game-theoretic and deception-sensitive algorithms to provide real-time enemy estimates to a tactical commander. a key hypothesis of the program is that such predictions and recommendations make the commander more effective, i.e. able to achieve operational goals safer, faster, and more efficiently. realistic experimentation and evaluation drive the development process, using human-in-the-loop wargames to compare humans against the raid system. two experiments were conducted in 2005 as part of phase i to determine if the raid software could make predictions and recommendations as effectively and accurately as a 4-person experienced staff. this report discusses the intriguing and encouraging results of these first two experiments conducted by the raid program. it also provides details about the experiment environment and methodology used to demonstrate and prove the research goals.",4 macro-economic time series modeling and interaction networks. macro-economic models describe the dynamics of economic quantities. the estimations and forecasts produced by such models play a substantial role in financial and political decisions. in this contribution we describe an approach based on genetic programming and symbolic regression to identify variable interactions in large datasets. in the proposed approach multiple symbolic regression runs are executed for each variable of the dataset to find potentially interesting models. 
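The positive/negative/neutral emoticon classification used in the study above can be sketched as a simple lookup; the emoticon lists here are illustrative examples, not the study's actual inventory.

```python
# Minimal sketch of positive/negative/neutral emoticon classification and
# aggregate counting; the emoticon lists are illustrative examples only.

POSITIVE = {":)", ":-)", ":D", ";)"}
NEGATIVE = {":(", ":-(", ":'("}

def classify_emoticon(e):
    if e in POSITIVE:
        return "positive"
    if e in NEGATIVE:
        return "negative"
    return "neutral"

def polarity_counts(emoticons):
    counts = {"positive": 0, "negative": 0, "neutral": 0}
    for e in emoticons:
        counts[classify_emoticon(e)] += 1
    return counts

counts = polarity_counts([":)", ":(", ":|", ":D"])
```

Aggregating these counts per newsgroup and week gives the kind of time series the temporal analysis compares across language-subject pairs.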
the result is a variable interaction network that describes which variables are most relevant for the approximation of each variable of the dataset. this approach is applied to a macro-economic dataset with monthly observations of important economic indicators in order to identify potentially interesting dependencies between these indicators. the resulting interaction network of macro-economic indicators is briefly discussed and two of the identified models are presented in detail. the two models approximate the help wanted index and the cpi inflation in the us.,4 "saga: a submodular greedy algorithm for group recommendation. in this paper, we propose a unified framework and an algorithm for the problem of group recommendation, where a fixed number of items or alternatives can be recommended to a group of users. the problem of group recommendation arises naturally in many real world contexts, and is closely related to the budgeted social choice problem studied in economics. we frame the group recommendation problem as choosing a subgraph with the largest group consensus score in a completely connected graph defined over the item affinity matrix. we propose a fast greedy algorithm with strong theoretical guarantees, and show that the proposed algorithm compares favorably to the state-of-the-art group recommendation algorithms, according to commonly used relevance and coverage performance measures, on a benchmark dataset.",4 "temporally consistent motion segmentation from rgb-d video. we present a method for temporally consistent motion segmentation from rgb-d videos assuming a piecewise rigid motion model. we formulate global energies over entire rgb-d sequences in terms of the segmentation of each frame into a number of objects, and the rigid motion of each object through the sequence. we develop a novel initialization procedure that clusters feature tracks obtained from the rgb data by leveraging the depth information. we minimize the energy using a coordinate descent approach that includes novel techniques to assemble object motion hypotheses. a main benefit of our approach is that it enables us to fuse consistently labeled object segments from all rgb-d frames of an input sequence into individual 3d object reconstructions.",4 "compression of neural machine translation models via pruning. neural machine translation (nmt), like many other deep learning domains, typically suffers from over-parameterization, resulting in large storage sizes. 
this paper examines three simple magnitude-based pruning schemes to compress nmt models, namely class-blind, class-uniform, and class-distribution, which differ in terms of how pruning thresholds are computed for the different classes of weights in the nmt architecture. we demonstrate the efficacy of weight pruning as a compression technique for a state-of-the-art nmt system. we show that an nmt model with over 200 million parameters can be pruned by 40% with very little performance loss, as measured on the wmt'14 english-german translation task. this sheds light on the distribution of redundancy in the nmt architecture. our main result is that with retraining, we can recover and even surpass the original performance with an 80%-pruned model.",4 "a hybrid approach to extract keyphrases from medical documents. keyphrases are the phrases, consisting of one or more words, representing the important concepts in articles. keyphrases are useful for a variety of tasks such as text summarization, automatic indexing, clustering/classification, text mining etc. this paper presents a hybrid approach to keyphrase extraction from medical documents. the keyphrase extraction approach presented in this paper is an amalgamation of two methods: the first one assigns weights to candidate keyphrases based on an effective combination of features such as position, term frequency, and inverse document frequency, and the second one assigns weights to candidate keyphrases using knowledge about their similarities to the structure and characteristics of keyphrases available in memory (a stored list of keyphrases). an efficient candidate keyphrase identification method, as the first component of the proposed keyphrase extraction system, is also introduced in this paper. experimental results show that the proposed hybrid approach performs better than some state-of-the-art keyphrase extraction approaches.",4 "image classification using svms: one-against-one vs one-against-all. support vector machines (svms) are a relatively new supervised classification technique in the land cover mapping community. they have their roots in statistical learning theory and have gained prominence because they are robust, accurate and effective even when using a small training sample. by their nature svms are essentially binary classifiers, however, they can be adopted to handle the multiple classification tasks common in remote sensing studies. 
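Two of the pruning schemes named in the compression abstract above can be sketched directly: class-blind applies one global magnitude threshold across all weights, while class-uniform prunes each weight class (parameter matrix) by the same percentage. The toy weight values are illustrative.

```python
# Sketch of magnitude-based pruning: class-blind uses one global threshold
# over all weights; class-uniform prunes each weight class by the same
# percentage. Toy weight values are illustrative.

def threshold(values, frac):
    """Magnitude below which the smallest `frac` of values fall."""
    s = sorted(abs(v) for v in values)
    return s[int(frac * len(s))]

def prune_class_blind(classes, frac):
    t = threshold([v for ws in classes.values() for v in ws], frac)
    return {name: [0.0 if abs(v) < t else v for v in ws]
            for name, ws in classes.items()}

def prune_class_uniform(classes, frac):
    return {name: [0.0 if abs(v) < threshold(ws, frac) else v for v in ws]
            for name, ws in classes.items()}

classes = {"embed": [0.1, 0.2, 0.3, 0.4], "softmax": [1.0, 2.0, 3.0, 4.0]}
blind = prune_class_blind(classes, 0.5)     # wipes out the small-magnitude class
uniform = prune_class_uniform(classes, 0.5)  # prunes 50% within each class
```

The toy example shows why the choice matters: a global threshold can erase an entire low-magnitude weight class, whereas per-class thresholds spread the pruning evenly.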
the two approaches commonly used are the one-against-one (1a1) and one-against-all (1aa) techniques. in this paper, these approaches are evaluated with regard to their impact and implications for land cover mapping. the main finding from this research is that whereas the 1aa technique is more predisposed to yielding unclassified and mixed pixels, the resulting classification accuracy is not significantly different from that of the 1a1 approach. it is the authors' conclusion, therefore, that ultimately the choice of technique adopted boils down to personal preference and the uniqueness of the dataset at hand.",4 "local distance metric learning for the nearest neighbor algorithm. distance metric learning is a successful way to enhance the performance of the nearest neighbor classifier. in most cases, however, the distribution of data does not obey a regular form and may change in different parts of the feature space. regarding that, this paper proposes a novel local distance metric learning method, namely local mahalanobis distance learning (lmdl), in order to enhance the performance of the nearest neighbor classifier. lmdl considers the neighborhood influence and learns multiple distance metrics for a reduced set of input samples. the reduced set is called prototypes, which try to preserve local discriminative information as much as possible. the proposed lmdl can be kernelized very easily, which is significantly desirable in the case of highly nonlinear data. the quality as well as the efficiency of the proposed method is assessed by a set of different experiments on various datasets, and the obtained results show that lmdl as well as its kernelized version is superior to the related state-of-the-art methods.",4 "greedy algorithms for cone constrained optimization with convergence guarantees. greedy optimization methods such as matching pursuit (mp) and frank-wolfe (fw) algorithms have regained popularity in recent years due to their simplicity, effectiveness and theoretical guarantees. mp and fw address optimization over the linear span and the convex hull of a set of atoms, respectively. in this paper, we consider the intermediate case of optimization over the convex cone, parametrized as the conic hull of a generic atom set, leading to the first principled definitions of non-negative mp algorithms, for which we give explicit convergence rates and demonstrate excellent empirical performance. 
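The 1a1 decision rule discussed above can be sketched as pairwise voting: one binary classifier per class pair, with ties left "unclassified" — the mixed-pixel behaviour the paper notes. The pairwise decision function below is a stand-in for trained binary svms.

```python
# Sketch of the 1a1 (one-against-one) decision rule: each pairwise binary
# classifier casts a vote; a tie is returned as None ("unclassified").
# The pairwise decision function is a stand-in for trained binary SVMs.
from itertools import combinations

def one_against_one(x, classes, pairwise_decision):
    votes = {c: 0 for c in classes}
    for a, b in combinations(classes, 2):
        votes[pairwise_decision(x, a, b)] += 1   # winner of the (a, b) duel
    best = max(votes.values())
    winners = [c for c, v in votes.items() if v == best]
    return winners[0] if len(winners) == 1 else None  # None = unclassified

label = one_against_one(None, ["water", "forest", "urban"],
                        lambda x, a, b: min(a, b))   # toy decision function
```

For k classes this needs k(k-1)/2 binary classifiers, versus k for 1aa; the paper's point is that despite the structural differences, overall accuracy ends up comparable.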
in particular, we derive sublinear ($\mathcal{o}(1/t)$) convergence on general smooth and convex objectives, and linear convergence ($\mathcal{o}(e^{-t})$) on strongly convex objectives, in both cases for general sets of atoms. furthermore, we establish a clear correspondence of our algorithms to known algorithms from the mp and fw literature. our novel algorithms and analyses target general atom sets and general objective functions, and hence are directly applicable to a large variety of learning settings.",4 "automatic dialect detection in arabic broadcast speech. we investigate different approaches for dialect identification in arabic broadcast speech, using phonetic and lexical features obtained from a speech recognition system, and acoustic features using the i-vector framework. we studied both generative and discriminative classifiers, and we combined these features using a multi-class support vector machine (svm). we validated our results on an arabic/english language identification task, with an accuracy of 100%. we used these features in a binary classifier to discriminate between modern standard arabic (msa) and dialectal arabic, with an accuracy of 100%. we further report results using the proposed method to discriminate between the five most widely used dialects of arabic: namely egyptian, gulf, levantine, north african, and msa, with an accuracy of 52%. we discuss dialect identification errors in the context of dialect code-switching between dialectal arabic and msa, and compare the error pattern between manually labeled data and the output of our classifier. we also release the train and test data as a standard corpus for dialect identification.",4 "plug-and-play priors for bright field electron tomography and sparse interpolation. many material and biological samples in scientific imaging are characterized by non-local repeating structures. these are studied using scanning electron microscopy and electron tomography. sparse sampling of individual pixels in a 2d image acquisition geometry, or sparse sampling of projection images with large tilt increments in a tomography experiment, can enable high speed data acquisition and minimize sample damage caused by the electron beam. in this paper, we present an algorithm for electron tomographic reconstruction and sparse image interpolation that exploits the non-local redundancy in images. 
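A minimal non-negative matching pursuit over a conic hull, in the spirit of the cone-constrained abstract above, can be sketched as follows; the greedy rule, quadratic objective and stopping test are simplifying assumptions, not the paper's exact algorithms.

```python
# Minimal sketch of non-negative matching pursuit over the conic hull of an
# atom set: greedily pick the atom most positively correlated with the
# residual, taking exact line-search steps on f(c) = 0.5*||y - atoms @ c||^2.
# The greedy rule and stopping test are simplifying assumptions.
import numpy as np

def nn_matching_pursuit(y, atoms, steps=100, tol=1e-10):
    """atoms: (d, n) matrix with unit-norm columns; returns coefficients >= 0."""
    coef = np.zeros(atoms.shape[1])
    for _ in range(steps):
        residual = y - atoms @ coef
        scores = atoms.T @ residual      # correlation of each atom
        i = int(np.argmax(scores))
        if scores[i] <= tol:             # no atom improves the fit
            break
        coef[i] += scores[i]             # exact line search for unit atoms
    return coef

coef = nn_matching_pursuit(np.array([2.0, -3.0]), np.eye(2))
```

Because steps are always non-negative, the iterate stays in the cone spanned by the atoms; with the identity atoms above, the result is the projection of `y` onto the non-negative orthant.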
we adapt a framework, termed plug-and-play (p&p) priors, to solve these imaging problems in a regularized inversion setting. the power of the p&p approach is that it allows a wide array of modern denoising algorithms to be used as a ""prior model"" for tomography and image interpolation. we also present sufficient mathematical conditions that ensure convergence of the p&p approach, and we use these insights to design a new non-local means denoising algorithm. finally, we demonstrate that the algorithm produces higher quality reconstructions on both simulated and real electron microscope data, along with improved convergence properties, compared to other methods.",4 "revisiting the master-slave architecture in multi-agent deep reinforcement learning. many tasks in artificial intelligence require the collaboration of multiple agents. we examine deep reinforcement learning in multi-agent domains. recent research efforts often take the form of two seemingly conflicting perspectives: the decentralized perspective, where each agent is supposed to have its own controller; and the centralized perspective, where one assumes there is a larger model controlling all agents. in this regard, we revisit the idea of the master-slave architecture by incorporating both perspectives within one framework. such a hierarchical structure naturally allows the two perspectives to leverage the advantages of one another. the idea of combining both perspectives is intuitive and can be well motivated from many real world systems; however, out of the variety of possible realizations, we highlight three key ingredients, i.e. composed action representation, learnable communication and independent reasoning. with network designs to facilitate these explicitly, our proposal consistently outperforms the latest competing methods both in synthetic experiments and when applied to challenging starcraft micromanagement tasks.",4 "knowledge-based and data-driven modeling in fuzzy rule-based systems: a critical reflection. this paper briefly elaborates on a development in (applied) fuzzy logic that has taken place in the last couple of decades, namely, the complementation or even replacement of the traditional knowledge-based approach to fuzzy rule-based systems design by a data-driven one. it is argued that the classical rule-based modeling paradigm is actually more amenable to the knowledge-based approach, for which it was originally conceived, and less apt for data-driven model design. 
an important reason that prevents fuzzy (rule-based) systems from being leveraged in large-scale applications is the flat structure of rule bases, along with the local nature of fuzzy rules and their limited ability to express complex dependencies between variables. this motivates alternative approaches to fuzzy systems modeling, in which functional dependencies can be represented more flexibly and more compactly in terms of hierarchical structures.",4 "effective sampling: fast segmentation using robust geometric model fitting. identifying the underlying models in a set of data points contaminated by noise and outliers leads to a highly complex multi-model fitting problem. this problem can be posed as a clustering problem by the projection of higher order affinities between data points into a graph, which can then be clustered using spectral clustering. calculating all possible higher order affinities is computationally expensive; hence in most cases only a subset is used. in this paper, we propose an effective sampling method to obtain a highly accurate approximation of the full graph, required to solve multi-structural model fitting problems in computer vision. the proposed method is based on the observation that the usefulness of a graph for segmentation improves as the distribution of the hypotheses (used to build the graph) approaches the distribution of the actual parameters for the given data. in this paper, we approximate the actual parameter distribution using a k-th order statistics based cost function, and samples are generated using a greedy algorithm coupled with a data sub-sampling strategy. experimental analysis shows that the proposed method is both accurate and computationally efficient compared to the state-of-the-art robust multi-model fitting techniques. the code is publicly available from https://github.com/ruwant/model-fitting-cbs.",4 "an interpatient respiratory motion model transfer for virtual reality simulations of liver punctures. current virtual reality (vr) training simulators of liver punctures often rely on static 3d patient data and use an unrealistic (sinusoidal) periodic animation of the respiratory movement. existing methods for the animation of breathing motion support only simple mathematical or patient-specific, estimated breathing models. 
however, for a personalized breathing model of a new patient, a heavily dose relevant and expensive 4d data acquisition is mandatory for keyframe-based motion modeling. given reference 4d data, a first model building stage using linear regression based motion field modeling takes place. the methodology shown allows the transfer of existing reference respiratory motion models of a 4d reference patient to a new static 3d patient. this goal is achieved by using non-linear inter-patient registration to warp one personalized 4d motion field model to the new 3d patient data. this cost- and dose-saving new method is shown visually in a qualitative proof-of-concept study.",4 "genetic algorithms for mentor-assisted evaluation function optimization. in this paper we demonstrate how genetic algorithms can be used to reverse engineer an evaluation function's parameters for computer chess. our results show that using an appropriate mentor, we can evolve a program that is on a par with top tournament-playing chess programs, outperforming a two-time world computer chess champion. this performance gain is achieved by evolving a program with a smaller number of parameters in its evaluation function to mimic the behavior of a superior mentor which uses a more extensive evaluation function. in principle, this mentor-assisted approach could be used in a wide range of problems for which appropriate mentors are available.",4 "measuring intelligence through games. artificial general intelligence (agi) refers to research aimed at tackling the full problem of artificial intelligence, that is, to create truly intelligent agents. this sets it apart from most ai research, which aims at solving relatively narrow domains, such as character recognition, motion planning, or increasing player satisfaction in games. but how do we know when an agent is truly intelligent? a common point of reference in the agi community is legg and hutter's formal definition of universal intelligence, which has the appeal of simplicity and generality but is unfortunately incomputable. games of various kinds are commonly used as benchmarks for ""narrow"" ai research, as they are considered to have many important properties. we argue that many of these properties carry over to the testing of general intelligence as well. we then sketch how such testing could practically be carried out. 
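The mentor-assisted evolution described in the chess abstract above can be illustrated with a toy genetic algorithm: candidate evaluation functions are linear in a feature vector, and fitness is agreement with a hypothetical mentor's scores on sample positions. The GA operators, features and mentor below are illustrative assumptions, not the paper's actual setup.

```python
# Toy sketch of mentor-assisted evolution: evolve a small weight vector so a
# linear evaluation mimics a (hypothetical) mentor's scores on sample
# positions. GA settings and the mentor are illustrative assumptions.
import random

random.seed(0)
MENTOR = [9.0, 5.0, 3.0, 3.0]            # hidden "superior" weights to mimic
POSITIONS = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(50)]

def evaluate(weights, pos):
    return sum(w * f for w, f in zip(weights, pos))

def fitness(weights):                     # negative squared error vs mentor
    return -sum((evaluate(weights, p) - evaluate(MENTOR, p)) ** 2
                for p in POSITIONS)

def evolve(pop_size=40, gens=60, sigma=0.5):
    pop = [[random.uniform(0, 10) for _ in range(4)] for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 4]    # truncation selection (elitist)
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = random.sample(parents, 2)
            cut = random.randrange(4)     # one-point crossover
            child = a[:cut] + b[cut:]
            child[random.randrange(4)] += random.gauss(0, sigma)  # mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

best = evolve()
```

The key idea carried over from the paper is that fitness never consults game outcomes directly, only agreement with the mentor, which makes the optimization cheap and well-defined.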
central part sketch extension universal intelligence deal finite time, use sampling space games expressed suitably biased game description language.",4 "jointly embedding relations mentions knowledge population. paper contributes joint embedding model predicting relations pair entities scenario relation inference. differs stand-alone approaches separately operate either knowledge bases free texts. proposed model simultaneously learns low-dimensional vector representations triplets knowledge repositories mentions relations free texts, leverage evidence resources make accurate predictions. use nell evaluate performance approach, compared cutting-edge methods. results extensive experiments show model achieves significant improvement relation extraction.",4 "stochastic gradient descent non-smooth optimization: convergence results optimal averaging schemes. stochastic gradient descent (sgd) one simplest popular stochastic optimization methods. already theoretically studied decades, classical analysis usually required non-trivial smoothness assumptions, apply many modern applications sgd non-smooth objective functions support vector machines. paper, investigate performance sgd without smoothness assumptions, well running average scheme convert sgd iterates solution optimal optimization accuracy. framework, prove rounds, suboptimality last sgd iterate scales o(log(t)/\sqrt{t}) non-smooth convex objective functions, o(log(t)/t) non-smooth strongly convex case. best knowledge, first bounds kind, almost match minimax-optimal rates obtainable appropriate averaging schemes. also propose new simple averaging scheme, attains optimal rates, also easily computed on-the-fly (in contrast, suffix averaging scheme proposed rakhlin et al. (2011) simple implement). finally, provide experimental illustrations.",4 "random forest system combination approach error detection digital dictionaries. 
digitizing print bilingual dictionary, whether via optical character recognition manual entry, inevitable errors introduced electronic version created. investigate automating process detecting errors xml representation digitized print dictionary using hybrid approach combines rule-based, feature-based, language model-based methods. investigate combining methods show using random forests promising approach. find isolation, unsupervised methods rival performance supervised methods. random forests typically require training data investigate apply random forests combine individual base methods unsupervised without requiring large amounts training data. experiments reveal empirically relatively small amount data sufficient potentially reduced specific selection criteria.",4 "independently recurrent neural network (indrnn): building longer deeper rnn. recurrent neural networks (rnns) widely used processing sequential data. however, rnns commonly difficult train due well-known gradient vanishing exploding problems hard learn long-term patterns. long short-term memory (lstm) gated recurrent unit (gru) developed address problems, use hyperbolic tangent sigmoid activation functions results gradient decay layers. consequently, construction efficiently trainable deep network challenging. addition, neurons rnn layer entangled together behaviour hard interpret. address problems, new type rnn, referred independently recurrent neural network (indrnn), proposed paper, neurons layer independent connected across layers. shown indrnn easily regulated prevent gradient exploding vanishing problems allowing network learn long-term dependencies. moreover, indrnn work non-saturated activation functions relu (rectified linear unit) still trained robustly. multiple indrnns stacked construct network deeper existing rnns. experimental results shown proposed indrnn able process long sequences (over 5000 time steps), used construct deep networks (21 layers used experiment) still trained robustly.
better performances achieved various tasks using indrnns compared traditional rnn lstm.",4 "neutrality: necessity self-adaptation. self-adaptation used main paradigms evolutionary computation increase efficiency. claim basis self-adaptation use neutrality. absence external control neutrality allows variation search distribution without risk fitness loss.",13 "toward smart power grids: communication network design power grids synchronization. smart power grids, keeping synchronicity generators corresponding controls great importance. so, simple model employed terms swing equation represent interactions among dynamics generators feedback control. case communication network available, control done based transmitted measurements communication network. stability system denoted largest eigenvalue weighted sum laplacian matrices communication infrastructure power network. work, use graph theory model communication network graph problem. then, ant colony system (acs) employed optimum design graph synchronization power grids. performance evaluation proposed method 39-bus new england power system versus methods exhaustive search rayleigh quotient approximation indicates feasibility effectiveness method even large scale smart power grids.",4 "memcomputing np-complete problems polynomial time using polynomial resources collective states. memcomputing novel non-turing paradigm computation uses interacting memory cells (memprocessors short) store process information physical platform. recently proved mathematically memcomputing machines computational power non-deterministic turing machines. therefore, solve np-complete problems polynomial time and, using appropriate architecture, resources grow polynomially input size. reason computational power stems properties inspired brain shared universal memcomputing machine, particular intrinsic parallelism information overhead, namely capability compressing information collective state memprocessor network.
here, show experimental demonstration actual memcomputing architecture solves np-complete version subset-sum problem one step composed number memprocessors scales linearly size problem. fabricated architecture using standard microelectronic technology easily realized laboratory setting. even though particular machine presented eventually limited noise--and thus require error-correcting codes scale arbitrary number memprocessors--it represents first proof-of-concept machine capable working collective state interacting memory cells, unlike present-day single-state machines built using von neumann architecture.",4 "model-based clustering classification functional data. problem complex data analysis central topic modern statistical science learning systems becoming broader interest increasing prevalence high-dimensional data. challenge develop statistical models autonomous algorithms able acquire knowledge raw data exploratory analysis, achieved clustering techniques make predictions future data via classification (i.e., discriminant analysis) techniques. latent data models, including mixture model-based approaches one popular successful approaches unsupervised context (i.e., clustering) supervised one (i.e, classification discrimination). although traditionally tools multivariate analysis, growing popularity considered framework functional data analysis (fda). fda data analysis paradigm individual data units functions (e.g., curves, surfaces), rather simple vectors. many areas application, analyzed data indeed often available form discretized values functions curves (e.g., time series, waveforms) surfaces (e.g., 2d-images, spatio-temporal data). functional aspect data adds additional difficulties compared case classical multivariate (non-functional) data analysis. review present approaches model-based clustering classification functional data. 
derive well-established statistical models along efficient algorithmic tools address problems regarding clustering classification high-dimensional data, including heterogeneity, missing information, dynamical hidden structure. presented models algorithms illustrated real-world functional data analysis problems several application areas.",19 "bayesian crack detection ultra high resolution multimodal images paintings. preservation cultural heritage paramount importance. thanks recent developments digital acquisition techniques, powerful image analysis algorithms developed useful non-invasive tools assist restoration preservation art. paper propose semi-supervised crack detection method used high-dimensional acquisitions paintings coming different modalities. dataset consists recently acquired collection images ghent altarpiece (1432), one northern europe's important art masterpieces. goal build classifier able discern crack pixels background consisting non-crack pixels, making optimal use information provided modality. accomplish employ recently developed non-parametric bayesian classifier, uses tensor factorizations characterize conditional probability. prior placed parameters factorization every possible interaction predictors allowed still identifying sparse subset among predictors. proposed bayesian classifier, refer conditional bayesian tensor factorization cbtf, assessed visually comparing classification results random forest (rf) algorithm.",4 "provable benefits representation learning. general consensus learning representations useful variety reasons, e.g. efficient use labeled data (semi-supervised learning), transfer learning understanding hidden structure data. popular techniques representation learning include clustering, manifold learning, kernel-learning, autoencoders, boltzmann machines, etc. study relative merits techniques, essential formalize definition goals representation learning, become instances definition.
paper introduces formal framework also formalizes utility learning representation. related previous bayesian notions, new twists. show usefulness framework exhibiting simple natural settings -- linear mixture models loglinear models, power representation learning formally shown. examples, representation learning performed provably efficiently plausible assumptions (despite np-hard), furthermore: (i) greatly reduces need labeled data (semi-supervised learning) (ii) allows solving classification tasks simpler approaches like nearest neighbors require much data (iii) powerful manifold learning methods.",4 "anisotropic diffusion details enhancement multi-exposure image fusion. develop multiexposure image fusion method based texture features, exploits edge preserving intraregion smoothing property nonlinear diffusion filters based partial differential equations (pde). captured multiexposure image series, first decompose images base layers detail layers extract sharp details fine details, respectively. magnitude gradient image intensity utilized encourage smoothness homogeneous regions preference inhomogeneous regions. then, considered texture features base layer generate mask (i.e., decision mask) guides fusion base layers multiresolution fashion. finally, well-exposed fused image obtained combines fused base layer detail layers scale across input exposures. proposed algorithm skipping complex high dynamic range image (hdri) generation tone mapping steps produce detail preserving image display standard dynamic range display devices. moreover, technique effective blending flash/no-flash image pair multifocus images, is, images focused different targets.",4 "two phase $q-$learning bidding-based vehicle sharing. consider one-way vehicle sharing systems customers rent car one station drop another. problem address optimize distribution cars, quality service, pricing rentals appropriately. 
propose bidding approach inspired auctions takes account significant uncertainty inherent problem data (e.g., pick-up drop-off locations, time requests, duration trips). specifically, contrast current vehicle sharing systems, operator set prices. instead, customers submit bids operator decides whether rent not. operator even accept negative bids motivate drivers rebalance available cars unpopular destinations within city. model operator's sequential decision-making problem \emph{constrained markov decision problem} (cmdp) propose rigorously analyze novel two phase $q$-learning algorithm solution. numerical experiments presented discussed.",4 "additive models trend filtering. consider additive models built trend filtering, i.e., additive models whose components regularized (discrete) total variation $(k+1)$st (discrete) derivative, chosen integer $k \geq 0$. results $k$th degree piecewise polynomial components, (e.g., $k=0$ gives piecewise constant components, $k=1$ gives piecewise linear, $k=2$ gives piecewise quadratic, etc.). univariate nonparametric regression, localized nature total variation regularizer used trend filtering shown produce estimates superior local adaptivity smoothing splines (and linear smoothers, generally) (tibshirani [2014]). further, structured nature regularizer shown lead highly efficient computational routines trend filtering (kim et al. [2009], ramdas tibshirani [2016]). paper, argue properties carry additive models setting. derive fast error rates additive trend filtering estimates, prove rates minimax optimal underlying function additive component functions whose derivatives bounded variation. show rates unattainable additive smoothing splines (and additive models built linear smoothers, general). argue backfitting provides efficient algorithm additive trend filtering, built around fast univariate trend filtering solvers; moreover, describe modified backfitting procedure whose iterations run parallel. 
finally, conduct experiments examine empirical properties additive trend filtering, outline possible extensions.",19 "behavior-based approach multi-agent q-learning autonomous exploration. use mobile robots popular world mainly autonomous explorations hazardous/toxic unknown environments. exploration effective efficient explorations unknown environment aided learning past experiences. currently reinforcement learning getting acceptances implementing learning robots system-environment interactions. learning implemented using concept single-agent multiagent. paper describes multiagent approach implementing type reinforcement learning using priority based behaviour-based architecture. proposed methodology successfully tested indoor outdoor environments.",4 "properties applications programs monotone convex constraints. study properties programs monotone convex constraints. extend formalisms concepts results normal logic programming. include notions strong uniform equivalence characterizations, tight programs fages lemma, program completion loop formulas. results provide abstract account properties recent extensions logic programming aggregates, especially formalism lparse programs. imply method compute stable models lparse programs means off-the-shelf solvers pseudo-boolean constraints, often much faster smodels system.",4 "depth-gated lstm. short note, present extension long short-term memory (lstm) neural networks using depth gate connect memory cells adjacent layers. introduces linear dependence lower upper layer recurrent units. importantly, linear dependence gated gating function, call depth gate. gate function lower layer memory cell, input past memory cell layer. conducted experiments verified new architecture lstms able improve machine translation language modeling performances.",4 "stratified transfer learning cross-domain activity recognition. activity recognition, often expensive time-consuming acquire sufficient activity labels.
solve problem, transfer learning leverages labeled samples source domain annotate target domain none labels. existing approaches typically consider learning global domain shift ignoring intra-affinity classes, hinder performance algorithms. paper, propose novel general cross-domain learning framework exploit intra-affinity classes perform intra-class knowledge transfer. proposed framework, referred stratified transfer learning (stl), dramatically improve classification accuracy cross-domain activity recognition. specifically, stl first obtains pseudo labels target domain via majority voting technique. then, performs intra-class knowledge transfer iteratively transform domains subspaces. finally, labels target domain obtained via second annotation. evaluate performance stl, conduct comprehensive experiments three large public activity recognition datasets (i.e. opportunity, pamap2, uci dsads), demonstrates stl significantly outperforms state-of-the-art methods w.r.t. classification accuracy (improvement 7.68%). furthermore, extensively investigate performance stl across different degrees similarities activity levels domains. also discuss potential stl pervasive computing applications provide empirical experience future research.",4 "network-size independent covering number bounds deep networks. give covering number bound deep learning networks independent size network. key simple analysis linear classifiers, rotating data affect covering number. thus, ignore rotation part layer's linear transformation, get covering number bound concentrating scaling part.",4 "revisiting video saliency: large-scale benchmark new model. work, contribute video saliency research two ways. first, introduce new benchmark predicting human eye movements dynamic scene free-viewing, long-time urged field.
dataset, named dhf1k (dynamic human fixation), consists 1k high-quality, elaborately selected video sequences spanning large range scenes, viewpoints, motions, object types background complexity. existing video saliency datasets lack variety generality common dynamic scenes fall short covering challenging situations unconstrained environments. contrast, dhf1k makes significant leap terms scalability, diversity difficulty, expected boost video saliency modeling. second, propose novel video saliency model augments cnn-lstm network architecture attention mechanism enable fast, end-to-end saliency learning. attention mechanism explicitly encodes static saliency information, thus allowing lstm focus learning flexible temporal saliency representation across successive frames. design fully leverages existing large-scale static fixation datasets, avoids overfitting, significantly improves training efficiency testing performance. thoroughly examine performance model, respect state art saliency models, three large-scale datasets (i.e., dhf1k, hollywood2, ucf sports). experimental results 1.2k testing videos containing 400k frames demonstrate model outperforms competitors.",4 "data-driven color augmentation techniques deep skin image analysis. dermoscopic skin images often obtained different imaging devices, varying acquisition conditions. work, instead attempting perform intensity color normalization, propose leverage computational color constancy techniques build artificial data augmentation technique suitable kind images. specifically, apply \emph{shades gray} color constancy technique color-normalize entire training set images, retaining estimated illuminants. draw one sample distribution training set illuminants apply normalized image. employ technique training two deep convolutional neural networks tasks skin lesion segmentation skin lesion classification, context isic 2017 challenge without using external dermatologic image set. 
results validation set promising, supplemented extended results hidden test set available.",4 "real-time novelty detector mobile robot. recognising new unusual features environment ability potentially useful robot. paper demonstrates algorithm achieves task learning internal representation `normality' sonar scans taken robot explores environment. model environment used evaluate novelty sonar scan presented relation model. stimuli seen before, therefore novelty, highlighted filter. filter ability forget features learned, stimuli seen rarely recover response time. number robot experiments presented demonstrate operation filter.",4 "graph regularized tensor sparse coding image representation. sparse coding (sc) unsupervised learning scheme received increasing amount interests recent years. however, conventional sc vectorizes input images, destructs intrinsic spatial structures images. paper, propose novel graph regularized tensor sparse coding (gtsc) image representation. gtsc preserves local proximity elementary structures image adopting newly proposed tubal-tensor representation. simultaneously, considers intrinsic geometric properties imposing graph regularization successfully applied uncover geometric distribution image data. moreover, returned sparse representations gtsc better physical explanations key operation (i.e., circular convolution) tubal-tensor model preserves shifting invariance property. experimental results image clustering demonstrate effectiveness proposed scheme.",4 "parent oriented teacher selection causes language diversity. evolutionary model emergence diversity language developed. investigated effects two real life observations, namely, people prefer people communicate well, people interact people physically close other. clearly groups relatively small compared entire population. restrict selection teachers small groups, called imitation sets, around parents. child learns language teacher selected within imitation set parent. 
result, subcommunities languages developed. within subcommunity comprehension found high. number languages related relative size imitation set power law.",4 "livdet 2017 fingerprint liveness detection competition 2017. fingerprint presentation attack detection (fpad) deals distinguishing images coming artificial replicas fingerprint characteristic, made materials like silicone, gelatine latex, images coming alive fingerprints. images captured modern scanners, typically relying solid-state optical technologies. since 2009, fingerprint liveness detection competition (livdet) aims assess performance state-of-the-art algorithms according rigorous experimental protocol and, time, simple overview basic achievements. competition open academics research centers companies work field. positive, increasing trend participants number, supports success initiative, confirmed even year: 17 algorithms submitted competition, larger involvement companies academies. means topic relevant sides, points lot work must done terms fundamental applied research.",4 "optimal spectral transportation application music transcription. many spectral unmixing methods rely non-negative decomposition spectral data onto dictionary spectral templates. particular, state-of-the-art music transcription systems decompose spectrogram input signal onto dictionary representative note spectra. typical measures fit used quantify adequacy decomposition compare data template entries frequency-wise. such, small displacements energy frequency bin another well variations timber disproportionally harm fit. address issues means optimal transportation propose new measure fit treats frequency distributions energy holistically opposed frequency-wise. building harmonic nature sound, new measure invariant shifts energy harmonically-related frequencies, well small local displacements energy. 
equipped new measure fit, dictionary note templates considerably simplified set dirac vectors located target fundamental frequencies (musical pitch values). turns gives ground fast simple decomposition algorithm achieves state-of-the-art performance real musical data.",19 "recovering hard-to-find object instances sampling context-based object proposals. paper focus improving object detection performance terms recall. propose post-detection stage explore image objective recovering missed detections. exploration performed sampling object proposals image. analyze four different strategies perform sampling, giving special attention strategies exploit spatial relations objects. addition, propose novel method discover higher-order relations groups objects. experiments challenging kitti dataset show proposed relations-based proposal generation strategies help improving recall cost relatively low amount object proposals.",4 "falsification future performance. information-theoretically reformulate two measures capacity statistical learning theory: empirical vc-entropy empirical rademacher complexity. show capacity measures count number hypotheses dataset learning algorithm falsifies finds classifier repertoire minimizing empirical risk. follows future performance predictors unseen data controlled part many hypotheses learner falsifies. corollary show empirical vc-entropy quantifies message length true hypothesis optimal code particular probability distribution, so-called actual repertoire.",19 "improved neural text attribute transfer non-parallel data. text attribute transfer using non-parallel data requires methods perform disentanglement content linguistic attributes. work, propose multiple improvements existing approaches enable encoder-decoder framework cope text attribute transfer non-parallel data. perform experiments sentiment transfer task using two datasets. 
datasets, proposed method outperforms strong baseline two three employed evaluation metrics.",4 "generalized two-dimensional linear discriminant analysis regularization. recent advances show two-dimensional linear discriminant analysis (2dlda) successful matrix based dimensionality reduction method. however, 2dlda may encounter singularity issue theoretically sensitivity outliers. paper, generalized lp-norm 2dlda framework regularization arbitrary $p>0$ proposed, named g2dlda. mainly two contributions g2dlda: one g2dlda model uses arbitrary lp-norm measure between-class within-class scatter, hence proper $p$ selected achieve robustness. one introducing extra regularization term, g2dlda achieves better generalization performance, solves singularity problem. addition, g2dlda solved series convex problems equality constraint, closed solution single problem. convergence guaranteed theoretically $1\leq p\leq2$. preliminary experimental results three contaminated human face databases show effectiveness proposed g2dlda.",4 "complex embeddings simple link prediction. statistical relational learning, link prediction problem key automatically understand structure large knowledge bases. previous studies, propose solve problem latent factorization. however, make use complex valued embeddings. composition complex embeddings handle large variety binary relations, among symmetric antisymmetric relations. compared state-of-the-art models neural tensor network holographic embeddings, approach based complex embeddings arguably simpler, uses hermitian dot product, complex counterpart standard dot product real vectors. approach scalable large datasets remains linear space time, consistently outperforming alternative approaches standard link prediction benchmarks.",4 "groupwise maximin fair allocation indivisible goods. study problem allocating indivisible goods among n agents fair manner. problem, maximin share (mms) well-studied solution concept provides fairness threshold. 
specifically, maximin share defined minimum utility agent guarantee asked partition set goods n bundles remaining (n-1) agents pick bundles adversarially. allocation deemed fair every agent gets bundle whose valuation least maximin share. even though maximin shares provide natural benchmark fairness, drawbacks and, particular, sufficient rule unsatisfactory allocations. motivated considerations, work define stronger notion fairness, called groupwise maximin share guarantee (gmms). gmms, require maximin share guarantee achieved respect grand bundle, also among subgroups agents. hence, solution concept strengthens mms provides ex-post fairness guarantee. show specific settings, gmms allocations always exist. also establish existence approximate gmms allocations additive valuations, develop polynomial-time algorithm find allocations. moreover, establish scale fairness wherein show gmms implies approximate envy freeness. finally, empirically demonstrate existence gmms allocations large set randomly generated instances. set instances, additionally show algorithm achieves approximation factor better established, worst-case bound.",4 "multi-scale continuous crfs sequential deep networks monocular depth estimation. paper addresses problem depth estimation single still image. inspired recent works multi-scale convolutional neural networks (cnn), propose deep model fuses complementary information derived multiple cnn side outputs. different previous methods, integration obtained means continuous conditional random fields (crfs). particular, propose two different variations, one based cascade multiple crfs, unified graphical model. designing novel cnn implementation mean-field updates continuous crfs, show proposed models regarded sequential deep networks training performed end-to-end.
extensive experimental evaluation demonstrate effectiveness proposed approach establish new state art results publicly available datasets.",4 "modelling scene dependent imaging cameras deep neural network. present novel deep learning framework models scene dependent image processing inside cameras. often called radiometric calibration, process recovering raw images processed images (jpeg format srgb color space) essential many computer vision tasks rely physically accurate radiance values. previous works rely deterministic imaging model color transformation stays regardless scene thus applied images taken manual mode. paper, propose data-driven approach learn scene dependent locally varying image processing inside cameras automode. method incorporates global local scene context pixel-wise features via multi-scale pyramid learnable histogram layers. results show model imaging pipeline different cameras operate automode accurately directions (from raw srgb, srgb raw) show apply method improve performance image deblurring.",4 "cognitive architecture based learning classifier system spiking classifiers. learning classifier systems (lcs) population-based reinforcement learners originally designed model various cognitive phenomena. paper presents explicitly cognitive lcs using spiking neural networks classifiers, providing classifier measure temporal dynamism. employ constructivist model growth neurons synaptic connections, permits genetic algorithm (ga) automatically evolve sufficiently-complex neural structures. spiking classifiers coupled temporally-sensitive reinforcement learning algorithm, allows system perform temporal state decomposition appropriately rewarding ""macro-actions,"" created chaining together multiple atomic actions. combination temporal reinforcement learning neural information processing shown outperform benchmark neural classifier systems, successfully solve robotic navigation task.",4 "indexability, concentration, vc theory.
degrading performance indexing schemes exact similarity search high dimensions long since linked histograms distributions distances 1-lipschitz functions getting concentrated. discuss observation framework phenomenon concentration measure structures high dimension vapnik-chervonenkis theory statistical learning.",4 "convolutional spike timing dependent plasticity based feature learning spiking neural networks. brain-inspired learning models attempt mimic cortical architecture computations performed neurons synapses constituting human brain achieve efficiency cognitive tasks. work, present convolutional spike timing dependent plasticity based feature learning biologically plausible leaky-integrate-and-fire neurons spiking neural networks (snns). use shared weight kernels trained encode representative features underlying input patterns thereby improving sparsity well robustness learning model. demonstrate proposed unsupervised learning methodology learns several visual categories object recognition fewer number examples outperforms traditional fully-connected snn architectures yielding competitive accuracy. additionally, observe learning model performs out-of-set generalization making proposed biologically plausible framework viable efficient architecture future neuromorphic applications.",4 "network unfolding map edge dynamics modeling. emergence collective dynamics neural networks mechanism animal human brain information processing. paper, develop computational technique using distributed processing elements complex network, called particles, solve semi-supervised learning problems. three actions govern particles' dynamics: generation, walking, absorption. labeled vertices generate new particles compete rival particles edge domination. active particles randomly walk network absorbed either rival vertex edge currently dominated rival particles. result model evolution consists sets edges arranged label dominance. 
each set tends to form a connected subnetwork to represent a data class. although the intrinsic dynamics of the model is a stochastic one, we prove that there exists a deterministic version with largely reduced computational complexity; specifically, with linear growth. furthermore, the edge domination process corresponds to an unfolding map in such a way that edges ""stretch"" and ""shrink"" according to the vertex-edge dynamics. consequently, the unfolding effect summarizes the relevant relationships between vertices and the uncovered data classes. the proposed model captures important details of connectivity patterns over the vertex-edge dynamics evolution, in contrast to previous approaches which focused on only vertex or only edge dynamics. computer simulations reveal that the new model can identify nonlinear features in both real and artificial data, including boundaries between distinct classes and overlapping structures of data.",4 "mlbench: how good are machine learning clouds for binary classification tasks on structured data?. we conduct an empirical study of machine learning functionalities provided by major cloud service providers, which we call machine learning clouds. machine learning clouds hold the promise of hiding all the sophistication of running large-scale machine learning: instead of specifying how to run a machine learning task, users only specify what machine learning task to run and the cloud figures out the rest. raising the level of abstraction, however, rarely comes free - a performance penalty is possible. how good, then, are current machine learning clouds on real-world machine learning workloads? we study this question with a focus on binary classification problems. we present mlbench, a novel benchmark constructed by harvesting datasets from kaggle competitions. we compare the performance of the top winning code available from kaggle with that of running machine learning clouds from both azure and amazon on mlbench. our comparative study reveals the strength and weakness of existing machine learning clouds and points out potential future directions for improvement.",4 "autoencoding beyond pixels using a learned similarity metric. we present an autoencoder that leverages learned representations to better measure similarities in data space. 
by combining a variational autoencoder with a generative adversarial network we can use learned feature representations in the gan discriminator as the basis for the vae reconstruction objective. thereby, we replace element-wise errors with feature-wise errors to better capture the data distribution while offering invariance towards e.g. translation. we apply our method to images of faces and show that it outperforms vaes with element-wise similarity measures in terms of visual fidelity. moreover, we show that the method learns an embedding in which high-level abstract visual features (e.g. wearing glasses) can be modified using simple arithmetic.",4 "dilated fcn for multi-agent 2d/3d medical image registration. 2d/3d image registration to align a 3d volume and 2d x-ray images is a challenging problem due to its ill-posed nature and various artifacts presented in 2d x-ray images. in this paper, we propose a multi-agent system with an auto attention mechanism for robust and efficient 2d/3d image registration. specifically, an individual agent is trained with a dilated fully convolutional network (fcn) to perform registration in a markov decision process (mdp) by observing a local region, and the final action is then taken based on the proposals from multiple agents weighted by their corresponding confidence levels. the contributions of this paper are threefold. first, we formulate 2d/3d registration as an mdp with observations, actions, and rewards properly defined with respect to x-ray imaging systems. second, to handle various artifacts in 2d x-ray images, multiple local agents are employed efficiently via fcn-based structures, and an auto attention mechanism is proposed to favor the proposals from regions with more reliable visual cues. third, a dilated fcn-based training mechanism is proposed to significantly reduce the degree of freedom in the simulation of the registration environment, and to drastically improve the training efficiency by an order of magnitude compared to a standard cnn-based training method. 
we demonstrate that the proposed method achieves high robustness on both spine cone beam computed tomography data with a low signal-to-noise ratio and data from minimally invasive spine surgery where severe image artifacts and occlusions are presented due to metal screws and guide wires, outperforming other state-of-the-art methods (single agent-based and optimization-based) by a large margin.",4 "metaheuristic optimization: algorithm analysis and open problems. metaheuristic algorithms are becoming an important part of modern optimization. a wide range of metaheuristic algorithms have emerged over the last two decades, and many metaheuristics such as particle swarm optimization are becoming increasingly popular. despite their popularity, the mathematical analysis of these algorithms lags behind. convergence analysis still remains unsolved for the majority of metaheuristic algorithms, while efficiency analysis is equally challenging. in this paper, we intend to provide an overview of convergence and efficiency studies of metaheuristics, and we try to provide a framework for analyzing metaheuristics in terms of convergence and efficiency. this can form a basis for analyzing other algorithms. we also outline some open questions as further research topics.",12 "deep image prior. deep convolutional networks have become a popular tool for image generation and restoration. generally, their excellent performance is imputed to their ability to learn realistic image priors from a large number of example images. in this paper, we show that, on the contrary, the structure of a generator network is sufficient to capture a great deal of low-level image statistics prior to any learning. in order to do so, we show that a randomly-initialized neural network can be used as a handcrafted prior with excellent results in standard inverse problems such as denoising, super-resolution, and inpainting. furthermore, the same prior can be used to invert deep neural representations to diagnose them, and to restore images based on flash-no flash input pairs. apart from its diverse applications, our approach highlights the inductive bias captured by standard generator network architectures. it also bridges the gap between two popular families of image restoration methods: learning-based methods using deep convolutional networks and learning-free methods based on handcrafted image priors such as self-similarity. 
code and supplementary material are available at https://dmitryulyanov.github.io/deep_image_prior .",4 "high-throughput and language-agnostic entity disambiguation and linking on user generated data. the entity disambiguation and linking (edl) task matches entity mentions in text to a unique knowledge base (kb) identifier such as a wikipedia or freebase id. it plays a critical role in the construction of a high quality information network, which can be further leveraged for a variety of information retrieval and nlp tasks such as text categorization and document tagging. edl is a complex and challenging problem due to the ambiguity of the mentions and real world text being multi-lingual. moreover, edl systems need to have high throughput and should be lightweight in order to scale to large datasets and run on off-the-shelf machines. more importantly, these systems need to be able to extract and disambiguate dense annotations from the data in order to enable an information retrieval or extraction task running on the data to be more efficient and accurate. in order to address all these challenges, we present the lithium edl system and algorithm - a high-throughput, lightweight, language-agnostic edl system that extracts and correctly disambiguates 75% more entities than state-of-the-art edl systems and is significantly faster than them.",4 "a new approach to two-view motion segmentation using global dimension minimization. we present a new approach to rigid-body motion segmentation from two views. we use a previously developed nonlinear embedding of two-view point correspondences into a 9-dimensional space and identify the different motions by segmenting lower-dimensional subspaces. in order to overcome nonuniform distributions along the subspaces, whose dimensions are unknown, we suggest the novel concept of global dimension and its minimization for clustering subspaces with some theoretical motivation. we propose a fast projected gradient algorithm for minimizing the global dimension and thus segmenting motions from 2-views. we develop an outlier detection framework around the proposed method, and present state-of-the-art results on outlier-free and outlier-corrupted two-view data for segmenting motion.",4 "video retrieval based on a deep convolutional neural network. recently, with the enormous growth of online videos, fast video retrieval research has received increasing attention. 
as an extension of image hashing techniques, traditional video hashing methods mainly depend on hand-crafted features and transform the real-valued features into binary hash codes. since videos provide far more diverse and complex visual information than images, extracting features from videos is much more challenging than from images. therefore, high-level semantic features to represent videos are needed rather than low-level hand-crafted methods. in this paper, a deep convolutional neural network is proposed to extract high-level semantic features, and a binary hash function is then integrated into this framework to achieve an end-to-end optimization. particularly, our approach also combines a triplet loss function, which preserves the relative similarity and difference of videos, and a classification loss function as the optimization objective. experiments have been performed on two public datasets and the results demonstrate the superiority of our proposed method compared with state-of-the-art video retrieval methods.",4 "learning from simulated and unsupervised images through adversarial training. with recent progress in graphics, it has become more tractable to train models on synthetic images, potentially avoiding the need for expensive annotations. however, learning from synthetic images may not achieve the desired performance due to a gap between synthetic and real image distributions. to reduce this gap, we propose simulated+unsupervised (s+u) learning, where the task is to learn a model to improve the realism of a simulator's output using unlabeled real data, while preserving the annotation information from the simulator. we develop a method for s+u learning that uses an adversarial network similar to generative adversarial networks (gans), but with synthetic images as inputs instead of random vectors. we make several key modifications to the standard gan algorithm to preserve annotations, avoid artifacts, and stabilize training: (i) a 'self-regularization' term, (ii) a local adversarial loss, and (iii) updating the discriminator using a history of refined images. we show that this enables generation of highly realistic images, which we demonstrate both qualitatively and with a user study. we quantitatively evaluate the generated images by training models for gaze estimation and hand pose estimation. 
we show a significant improvement over using synthetic images, and achieve state-of-the-art results on the mpiigaze dataset without any labeled real data.",4 "learning spatio-temporal features with partial expression sequences for on-the-fly prediction. spatio-temporal feature encoding is essential for encoding facial expression dynamics in video sequences. at test time, most spatio-temporal encoding methods assume that a temporally segmented sequence is fed to a learned model, which could require the prediction to wait until the full sequence is available to an auxiliary task that performs the temporal segmentation. this causes a delay in predicting the expression. in an interactive setting, such as with affective interactive agents, such a delay in the prediction could not be tolerated. therefore, training a model that can accurately predict the facial expression ""on-the-fly"" (as it is fed to the system) is essential. in this paper, we propose a new spatio-temporal feature learning method, which would allow prediction with partial sequences. as such, the prediction could be performed on-the-fly. the proposed method utilizes an estimated expression intensity to generate dense labels, which are used to regulate the prediction model training with a novel objective function. as a result, the learned spatio-temporal features can robustly predict the expression with partial (incomplete) expression sequences, on-the-fly. experimental results showed that the proposed method achieved higher recognition rates compared to the state-of-the-art methods on both datasets. more importantly, the results verified that the proposed method improved the prediction for frames with partial expression sequence inputs.",4 "planit: a crowdsourcing approach for learning to plan paths from large scale preference feedback. we consider the problem of learning user preferences over robot trajectories for environments rich with objects and humans. this is challenging because the criterion defining a good trajectory varies with users, tasks and interactions in the environment. we represent trajectory preferences using a cost function that the robot learns and uses to generate good trajectories in new environments. we design a crowdsourcing system - planit, where non-expert users label segments of the robot's trajectory. planit allows us to collect a large amount of user feedback, and using these weak and noisy labels from planit we learn the parameters of our model. 
we test our approach on 122 different environments for robotic navigation and manipulation tasks. our extensive experiments show that the learned cost function generates preferred trajectories in human environments. our crowdsourcing system is publicly available for the visualization of the learned costs and for providing preference feedback: \url{http://planit.cs.cornell.edu}",4 "forecasting the indian rupee (inr) / us dollar (usd) currency exchange rate using an artificial neural network. a large part of the workforce, and one growing every day, is originally from india. india is one of the countries with the second largest population in the world, and it has a lot to offer in terms of jobs. the sheer number of workers makes it a formidable travelling force as well, easily picking up employment in english speaking countries. with the beginning of the economic crises since september 2008, many indians returned to their homeland, which made a substantial impression on the indian rupee (inr) as likened to the us dollar (usd). using computational knowledge based techniques for forecasting has proved highly successful in present times. the purpose of this paper is to examine the effects of several important neural network factors on model fitting and forecasting behaviours. in this paper, an artificial neural network is successfully used for exchange rate forecasting. the paper examines the effects of the number of inputs and hidden nodes and the size of the training sample on the in-sample and out-of-sample performance. the indian rupee (inr) / us dollar (usd) exchange rate is used for detailed examinations. the number of input nodes has a greater impact on performance than the number of hidden nodes, while a large number of observations reduces forecast errors.",4 "a comparison of several sparse recovery methods for low rank matrices with random samples. in this paper, we investigate the efficacy of imat (iterative method of adaptive thresholding) in recovering the sparse signal (parameters) for linear models with missing data. sparse recovery arises in compressed sensing and machine learning problems and has various applications necessitating viable reconstruction methods, specifically when we work with big data. this paper focuses on comparing the power of imat in reconstruction of the desired sparse signal with lasso. additionally, we assume the model has random missing information. 
missing data has recently been of interest in big data and machine learning problems since it appears in many cases, including limited medical imaging datasets, hospital datasets, and massive mimo. the dominance of imat over the well-known lasso is taken into account in different scenarios. simulations and numerical results are also provided to verify the arguments.",4 "structure based extended resolution for constraint programming. nogood learning is a powerful approach to reducing search in constraint programming (cp) solvers. the current state of the art, called lazy clause generation (lcg), uses resolution to derive nogoods expressing the reasons for each search failure. such nogoods can prune other parts of the search tree, producing exponential speedups on a wide variety of problems. nogood learning solvers can be seen as resolution proof systems. the stronger the proof system, the faster it can solve a cp problem. it has recently been shown that the proof system used in lcg is at least as strong as general resolution. however, stronger proof systems such as \emph{extended resolution} exist. extended resolution allows literals expressing arbitrary logical concepts over existing variables to be introduced, and can allow exponentially smaller proofs than general resolution. the primary problem in using extended resolution is to figure out exactly which literals are useful to introduce. in this paper, we show how to use the structural information contained in a cp model in order to introduce useful literals, and we show that this can translate into significant speedups on a range of problems.",4 "rough sets and matroidal contraction. rough sets are efficient for data pre-processing in data mining. as a generalization of linear independence in vector spaces, matroids provide well-established platforms for greedy algorithms. in this paper, we apply rough sets to matroids and study the contraction of the dual of the corresponding matroid. first, for an equivalence relation on a universe, a matroidal structure of the rough set is established through the lower approximation operator. second, the dual of the matroid and its properties such as independent sets, bases and the rank function are investigated. 
finally, the relationships between the contraction of the dual matroid to the complement of a single point set and the contraction of the dual matroid to the complement of the equivalence class of this point are studied.",4 "detecting early signs of depressive and manic episodes in patients with bipolar disorder using the signature-based model. recurrent major mood episodes and subsyndromal mood instability cause substantial disability in patients with bipolar disorder. early identification of mood episodes enabling timely mood stabilisation is an important clinical goal. recent technological advances allow the prospective reporting of mood in real time, enabling more accurate and efficient data capture. the complex nature of these data streams, in combination with the challenge of deriving meaning from missing data, means that they pose a significant analytic challenge. the signature method is derived from stochastic analysis and has the ability to capture important properties of complex ordered time series data. we explore whether the onset of episodes of mania and depression can be identified using self-reported mood data.",19 "evolution of ideas: a novel memetic algorithm based on semantic networks. this paper presents a new type of evolutionary algorithm (ea) based on the concept of ""meme"", where the individuals forming the population are represented by semantic networks and the fitness measure is defined as a function of the represented knowledge. our work can be classified as a novel memetic algorithm (ma), given that (1) it is the units of culture, or information, that are undergoing variation, transmission, and selection, close to the original sense of memetics as introduced by dawkins; and (2) different from existing ma, the idea of memetics is utilized as a means of local refinement by individual learning after classical global sampling of ea. the individual pieces of information are represented as simple semantic networks, that is, directed graphs of concepts and binary relations, going through variation by memetic versions of operators such as crossover and mutation, which utilize knowledge from commonsense knowledge bases. in evaluating this introductory work, as an interesting fitness measure, we focus on using the structure mapping theory of analogical reasoning from psychology to evolve pieces of information analogous to a given base information. 
considering other possible fitness measures, the proposed representation and algorithm can serve as a computational tool for modeling memetic theories of knowledge, such as evolutionary epistemology and cultural selection theory.",4 "tosca: operationalizing commitments above information protocols. the notion of commitment is widely studied as a high-level abstraction for modeling multiagent interaction. an important challenge is supporting flexible decentralized enactments of commitment specifications. in this paper, we combine recent advances on specifying commitments and information protocols. specifically, we contribute tosca, a technique for automatically synthesizing information protocols from commitment specifications. our main result is that the synthesized protocols support commitment alignment, the idea that agents must make compatible inferences about their commitments despite decentralization.",4 "meta-learning within projective simulation. learning models of artificial intelligence can nowadays perform very well on a large variety of tasks. however, in practice different task environments are best handled by different learning models, rather than a single, universal, approach. most non-trivial models thus require the adjustment of several to many learning parameters, which is often done on a case-by-case basis by an external party. meta-learning refers to the ability of an agent to autonomously and dynamically adjust its own learning parameters, or meta-parameters. in this work we show how projective simulation, a recently developed model of artificial intelligence, can naturally be extended to account for meta-learning in reinforcement learning settings. the projective simulation approach is based on a random walk process over a network of clips. the suggested meta-learning scheme builds upon the same design and employs clip networks to monitor the agent's performance and to adjust its meta-parameters ""on the fly"". we distinguish between ""reflexive adaptation"" and ""adaptation through learning"", and show the utility of both approaches. in addition, a trade-off between flexibility and learning-time is addressed. 
the extended model is examined on three different kinds of reinforcement learning tasks, in which the agent has different optimal values of the meta-parameters, and is shown to perform well, reaching near-optimal to optimal success rates in all of them, without ever needing to manually adjust any meta-parameter.",4 "image denoising using optimally weighted bilateral filters: a sure and fast approach. the bilateral filter is known to be quite effective in denoising images corrupted with small dosages of additive gaussian noise. the denoising performance of the filter, however, is known to degrade quickly with the increase in noise level. several adaptations of the filter have been proposed in the literature to address this shortcoming, but often at a substantial computational overhead. in this paper, we report a simple pre-processing step that can substantially improve the denoising performance of the bilateral filter, at almost no additional cost. the modified filter is designed to be robust at large noise levels, and it often tends to perform poorly below a certain noise threshold. to get the best of the original and the modified filter, we propose to combine them in a weighted fashion, where the weights are chosen to minimize (a surrogate of) the oracle mean-squared-error (mse). the optimally-weighted filter is thus guaranteed to perform better than either of the component filters in terms of the mse, at all noise levels. we also provide a fast algorithm for the weighted filtering. visual and quantitative denoising results on standard test images are reported which demonstrate that the improvement over the original filter is significant both visually and in terms of psnr. moreover, the denoising performance of the optimally-weighted bilateral filter is competitive with the computation-intensive non-local means filter.",4 "oriented bounding boxes using multiresolution contours for fast interference detection of arbitrary geometry objects. interference detection of arbitrary geometric objects is not a trivial task due to the heavy computational load imposed by implementation issues. hierarchically structured bounding boxes help us to quickly isolate the contour segments in interference. in this paper, a new approach is introduced to treat the interference detection problem involving the representation of arbitrary shaped objects. 
our proposed method relies upon searching for the best possible way to represent contours by means of hierarchically structured rectangular oriented bounding boxes. the technique handles 2d objects with boundaries defined by closed b-spline curves with roughness details. each oriented box is adapted and fitted to the segments of the contour using second order statistical indicators from elements of the segments of the object contour in a multiresolution framework. our method is efficient and robust when it comes to 2d animations in real time. it can deal with smooth curves and polygonal approximations as well, and results are presented to illustrate the performance of the new method.",4 "text-based lstm networks for automatic music composition. in this paper, we introduce new methods and discuss the results of text-based lstm (long short-term memory) networks for automatic music composition. the proposed network is designed to learn relationships within text documents that represent chord progressions and drum tracks in two case studies. in the experiments, word-rnns (recurrent neural networks) show good results for both cases, while character-based rnns (char-rnns) only succeed to learn chord progressions. the proposed system can be used for fully automatic composition or as semi-automatic systems that help humans compose music by controlling a diversity parameter of the model.",4 "testing for homogeneity with kernel fisher discriminant analysis. we propose to investigate test statistics for testing homogeneity in reproducing kernel hilbert spaces. asymptotic null distributions under the null hypothesis are derived, and consistency against fixed and local alternatives is assessed. finally, experimental evidence of the performance of the proposed approach on both artificial data and a speaker verification task is provided.",19 "on the optimality of tree-reweighted max-product message-passing. tree-reweighted max-product (trw) message passing is a modified form of the ordinary max-product algorithm for attempting to find minimal energy configurations in markov random fields with cycles. for a trw fixed point satisfying the strong tree agreement condition, the algorithm outputs a configuration that is provably optimal. in this paper, we focus on the case of binary variables with pairwise couplings, and establish stronger properties of trw fixed points that satisfy only the milder condition of weak tree agreement (wta). 
first, we demonstrate that it is possible to identify part of the optimal solution, i.e., a provably optimal solution for a subset of nodes, without knowing the complete solution. second, we show that for submodular functions, a wta fixed point always yields a globally optimal solution. we establish that for binary variables, any wta fixed point always achieves the global maximum of the linear programming relaxation underlying the trw method.",4 "3d planar patch extraction from stereo using probabilistic region growing. this article presents a novel 3d planar patch extraction method using a probabilistic region growing algorithm. our method works by simultaneously initiating multiple planar patches from seed points, where the latter are determined by an intensity-based 2d segmentation algorithm on the stereo-pair images. patches are grown incrementally and in parallel as 3d scene points are considered for membership, using a probabilistic distance likelihood measure. in addition, we incorporate prior information based on the noise model of the 2d images and the scene configuration, and also include intensity information from the resulting initial segmentation. our method works well across many different data-sets, involving real and synthetic examples of both regularly and non-regularly sampled data, and it is fast enough that it may be used in robot navigation tasks for path detection and obstacle avoidance.",4 "principal motion components for gesture recognition using a single-example. this paper introduces principal motion components (pmc), a new method for one-shot gesture recognition. in the considered scenario a single training-video is available for each gesture to be recognized, which limits the application of traditional techniques (e.g., hmms). in pmc, a 2d map of motion energy is obtained per each pair of consecutive frames in a video. the motion maps associated to a video are processed to obtain a pca model, which is used for recognition under a reconstruction-error approach. the main benefits of the proposed approach are its simplicity, easiness of implementation, and competitive performance and efficiency. we report experimental results in one-shot gesture recognition using the chalearn gesture dataset; a benchmark comprising more than 50,000 gestures, recorded as both rgb and depth video with a kinect camera. 
the results obtained with pmc are competitive with alternative methods proposed for the same data set.",4 "network motifs analysis of croatian literature. in this paper we analyse network motifs in the co-occurrence directed networks constructed from five different texts (four books and one portal) in the croatian language. after preparing the data and constructing the networks, we perform the network motif analysis. we analyse the motif frequencies and z-scores in the five networks, and present the triad significance profile for the five datasets. furthermore, we compare our results with the existing results for linguistic networks. firstly, we show that the triad significance profile for the croatian language is similar to other languages and that all the networks belong to the same family of networks. however, there are certain differences between the croatian language and the other analysed languages. we conclude that this is due to the free word-order of the croatian language.",4 "distributed newton methods for deep neural networks. deep learning involves a difficult non-convex optimization problem with a large number of weights between any two adjacent layers of a deep structure. to handle large data sets or complicated networks, distributed training is needed, but the calculation of function, gradient, and hessian is expensive. in particular, the communication and the synchronization cost may become a bottleneck. in this paper, we focus on situations where the model is distributedly stored, and propose a novel distributed newton method for training deep neural networks. by variable and feature-wise data partitions, and some careful designs, we are able to explicitly use the jacobian matrix for matrix-vector products in the newton method. some techniques are incorporated to reduce the running time as well as the memory consumption. first, to reduce the communication cost, we propose a diagonalization method such that an approximate newton direction can be obtained without communication between machines. second, we consider subsampled gauss-newton matrices for reducing the running time as well as the communication cost. third, to reduce the synchronization cost, we terminate the process of finding an approximate newton direction even though some nodes have not finished their tasks. details of some implementation issues in distributed environments are thoroughly investigated. experiments demonstrate that the proposed method is effective for the distributed training of deep neural networks. 
compared with stochastic gradient methods, it is more robust and may give better test accuracy.",19 "counterfactual learning for machine translation: degeneracies and solutions. counterfactual learning is a natural scenario to improve web-based machine translation services by offline learning from feedback logged during user interactions. in order to avoid the risk of showing inferior translations to users, in such scenarios mostly exploration-free deterministic logging policies are in place. we analyze possible degeneracies of inverse and reweighted propensity scoring estimators, in stochastic and deterministic settings, and relate them to recently proposed techniques for counterfactual learning under deterministic logging.",19 "detecting volcano deformation in insar using deep learning. globally 800 million people live within 100 km of a volcano, and currently 1500 volcanoes are considered active, but half of these have no ground-based monitoring. alternatively, satellite radar (insar) can be employed to observe volcanic ground deformation, which has shown a significant statistical link to eruptions. modern satellites provide large coverage with high resolution signals, leading to huge amounts of data. this explosion in data has brought major challenges associated with the timely dissemination of information and distinguishing volcano deformation patterns from noise, which currently relies on manual inspection. moreover, volcano observatories still lack the expertise to exploit satellite datasets, particularly in developing countries. this paper presents a novel approach to detect volcanic ground deformation automatically from wrapped-phase insar images. convolutional neural networks (cnn) are employed to detect unusual patterns within the radar data.",4 utilization of deep reinforcement learning for saccadic-based object visual search. the paper focuses on the problem of learning saccades enabling visual object search. the developed system combines reinforcement learning with a neural network for learning to predict the possible outcomes of its actions. we validated the solution in three types of environment consisting of (pseudo)-randomly generated matrices of digits. 
the experimental verification is followed by a discussion regarding elements required by systems mimicking the fovea movement, and possible research directions.,4 "a greedy algorithm for cluster specialists. several recent deep neural networks experiments leverage the generalist-specialist paradigm for classification. however, no formal study has compared the performance of different clustering algorithms for class assignment. in this paper we perform such a study, suggest slight modifications to the clustering procedures, and propose a novel algorithm designed to optimize the performance of a specialist-generalist classification system. our experiments on the cifar-10 and cifar-100 datasets allow us to investigate situations with a varying number of classes on similar data. we find that our \emph{greedy pairs} clustering algorithm consistently outperforms other alternatives, while the choice of the confusion matrix has little impact on the final performance.",4 "the information revolution. the world is passing through a major revolution called the information revolution, in which information and knowledge are becoming available to people in unprecedented amounts wherever and whenever they need it. societies which fail to take advantage of the new technology will be left behind, as in the industrial revolution. the information revolution is based on two major technologies: computers and communication. these technologies have to be delivered in a cost effective manner, and in languages accessible to the people. one way to deliver them in a cost effective manner is to make suitable technology choices (discussed later), and to allow people to access shared resources. this could be done through street corner shops (for computer usage, e-mail etc.), schools, community centers and local library centres.",4 model-based bayesian exploration. reinforcement learning systems are often concerned with balancing exploration of untested actions against exploitation of actions that are known to be good. the benefit of exploration can be estimated using the classical notion of value of information - the expected improvement in future decision quality arising from the information acquired by exploration. estimating this quantity requires an assessment of the agent's uncertainty about its current value estimates for states. in this paper we investigate ways of representing and reasoning about this uncertainty in algorithms where the system attempts to learn a model of its environment. 
we explicitly represent uncertainty about the parameters of the model and build probability distributions over q-values based on these. these distributions are used to compute a myopic approximation to the value of information for each action and hence to select the action that best balances exploration and exploitation.,4 "retrofitting word vectors to semantic lexicons. vector space word representations are learned from distributional information of words in large corpora. although such statistics are semantically informative, they disregard the valuable information that is contained in semantic lexicons such as wordnet, framenet, and the paraphrase database. this paper proposes a method for refining vector space representations using relational information from semantic lexicons by encouraging linked words to have similar vector representations, and it makes no assumptions about how the input vectors were constructed. evaluated on a battery of standard lexical semantic evaluation tasks in several languages, we obtain substantial improvements starting from a variety of word vector models. our refinement method outperforms prior techniques for incorporating semantic lexicons into word vector training algorithms.",4 "optimization of evolutionary neural networks using hybrid learning algorithms. evolutionary artificial neural networks (eanns) refer to a special class of artificial neural networks (anns) in which evolution is another fundamental form of adaptation in addition to learning. evolutionary algorithms are used to adapt the connection weights, network architecture and learning algorithms according to the problem environment. even though evolutionary algorithms are well known as efficient global search algorithms, very often they miss the best local solutions in the complex solution space. in this paper, we propose a hybrid meta-heuristic learning approach combining evolutionary learning and local search methods (using 1st and 2nd order error information) to improve the learning and achieve faster convergence than obtained using a direct evolutionary approach. the proposed technique is tested on three different chaotic time series and the test results are compared with some popular neuro-fuzzy systems and a recently developed cutting angle method of global optimization. 
empirical results reveal that the proposed technique is efficient in spite of its computational complexity.",4 "learning the image-conditioned label space for multilabel classification. this work addresses the task of multilabel image classification. inspired by the great success of deep convolutional neural networks (cnns) on single-label visual-semantic embedding, we exploit extending these models to multilabel images. specifically, we propose an image-dependent ranking model, which returns a ranked list of labels according to their relevance to the input image. in contrast to conventional cnn models that learn an image representation (i.e. the image embedding vector), the developed model learns a mapping (i.e. a transformation matrix) from an image in an attempt to differentiate between its relevant and irrelevant labels. despite the conceptual simplicity of our approach, experimental results on a public benchmark dataset demonstrate that the proposed model achieves state-of-the-art performance while using fewer training images than other multilabel classification methods.",4 "semi-supervised clustering methods. cluster analysis methods seek to partition a data set into homogeneous subgroups. such methods are useful in a wide variety of applications, including document processing and modern genetics. conventional clustering methods are unsupervised, meaning that there is no outcome variable and that nothing is known about the relationship between the observations in the data set. in many situations, however, information about the clusters is available in addition to the values of the features. for example, the cluster labels of some observations may be known, or certain observations may be known to belong to the same cluster. in other cases, one may wish to identify clusters that are associated with a particular outcome variable. this review describes several clustering algorithms (known as ""semi-supervised clustering"" methods) that can be applied in these situations. the majority of these methods are modifications of the popular k-means clustering method, and several of them are described in detail. a brief description of some other semi-supervised clustering algorithms is also provided.",19 "an egocentric look at video photographer identity. egocentric cameras are being worn by an increasing number of users, among them many security forces worldwide. gopro cameras have already penetrated the mass market, reporting a substantial increase in sales every year.
since head-worn cameras do not capture the photographer, it may seem that the anonymity of the photographer is preserved even when the video is publicly distributed. we show that camera motion, as computed from the egocentric video, provides unique identity information. the photographer can be reliably recognized from a few seconds of video captured while walking. the proposed method achieves 90% recognition accuracy in cases where the random success rate is 3%. applications include theft prevention by locking the camera when not worn by its rightful owner. searching video sharing services (e.g. youtube) for egocentric videos shot by a specific photographer may also become possible. an important message of this paper is that photographers should be aware that sharing egocentric video will compromise their anonymity, even when their face is not visible.",4 "predicting preference flips in commerce search. traditional approaches to ranking in web search follow the paradigm of rank-by-score: a learned function gives each query-url combination an absolute score, and urls are ranked according to this score. this paradigm ensures that if the score of one url is better than that of another, the url with the better score will always be ranked higher than the other. such scoring contradicts prior work in behavioral economics which showed that users' preferences between two items depend not only on the items themselves but also on the presented alternatives. thus, for a given query, users' preference between items a and b depends on the presence/absence of item c. we propose a new model for ranking, the random shopper model, that allows for and explains such behavior. in this model, each feature is viewed as a markov chain over the items to be ranked, and the goal is to find a weighting of the features that best reflects their importance. we show that the model can be learned within the empirical risk minimization framework, and give an efficient learning algorithm. experiments on commerce search logs demonstrate that our algorithm outperforms scoring-based approaches, including regression and listwise ranking.",4 "texture segmentation based video compression using convolutional neural networks. there has been growing interest in using different approaches to improve the coding efficiency of modern video codecs in recent years as the demand for web-based video consumption increases. in this paper, we propose a model-based approach that uses texture analysis/synthesis to reconstruct blocks in texture regions of a video to achieve potential coding gains, using the av1 codec developed by the alliance for open media (aom).
the proposed method uses convolutional neural networks to extract texture regions in a frame, which are then reconstructed using a global motion model. preliminary results show an increase in coding efficiency while maintaining satisfactory visual quality.",4 "demon: depth and motion network for learning monocular stereo. in this paper we formulate structure from motion as a learning problem. we train a convolutional network end-to-end to compute depth and camera motion from successive, unconstrained image pairs. the architecture is composed of multiple stacked encoder-decoder networks, the core part being an iterative network that is able to improve its own predictions. the network estimates not only depth and motion, but additionally surface normals, optical flow between the images, and confidence of the matching. a crucial component of the approach is a training loss based on spatial relative differences. compared to traditional two-frame structure from motion methods, the results are more accurate and more robust. in contrast to the popular depth-from-single-image networks, demon learns the concept of matching and, thus, better generalizes to structures not seen during training.",4 "inferring gene regulatory network using an evolutionary multi-objective method. inference of gene regulatory networks (grns) based on experimental data is a challenging task in bioinformatics. in this paper, we present a bi-objective minimization model (bomm) for the inference of grns, in which one objective is the fitting error of derivatives, and the other is the number of connections in the network. to solve the bomm efficiently, we propose a multi-objective evolutionary algorithm (moea), and utilize the separable parameter estimation method (spem) for decoupling the ordinary differential equation (ode) system. then, the akaike information criterion (aic) is employed to select one inference result from the obtained pareto set. taking the s-system as the investigated grn model, our method can properly identify the topologies and parameter values of benchmark systems. there is no need to preset problem-dependent parameter values to obtain appropriate results; thus, our method could be applicable to the inference of various grn models.",4 "graph-sparse logistic regression. we introduce graph-sparse logistic regression, a new algorithm for classification in the case where the support should be sparse and connected on a graph.
we validate our algorithm on synthetic data and benchmark it against l1-regularized logistic regression. we explore the technique in a bioinformatics context, on proteomics data with an interactome graph. we make our experimental code public and provide gslr as an open source package.",4 "compression of deep neural networks for image instance retrieval. image instance retrieval is the problem of retrieving images from a database that contain the same object. convolutional neural network (cnn) based descriptors are becoming the dominant approach for generating {\it global image descriptors} for the instance retrieval problem. one major drawback of cnn-based {\it global descriptors} is that uncompressed deep neural network models require hundreds of megabytes of storage, making them inconvenient to deploy in mobile applications or custom hardware. in this work, we study the problem of neural network model compression, focusing on the image instance retrieval task. we study quantization, coding, pruning and weight sharing techniques for reducing model size for the instance retrieval problem. we provide extensive experimental results on the trade-off between retrieval performance and model size for different types of networks on several data sets, providing a comprehensive study of this topic. we compress models to the order of a few mbs: two orders of magnitude smaller than the uncompressed models, while achieving negligible loss in retrieval performance.",4 "using differential evolution for the graph coloring. differential evolution was developed for reliable and versatile function optimization. it has also become interesting for other domains because of its ease of use. in this paper, we pose the question of whether differential evolution can also be used for solving combinatorial optimization problems, and in particular, the graph coloring problem. therefore, a hybrid self-adaptive differential evolution algorithm for graph coloring is proposed that is comparable with the best heuristics for graph coloring today, i.e. tabucol of hertz and de werra and the hybrid evolutionary algorithm of galinier and hao. we focus on graph 3-coloring. therefore, the evolutionary algorithm with the method saw of eiben et al., which achieved excellent results on this kind of graphs, is also incorporated into this study.
extensive experiments show that differential evolution could become a competitive tool for solving the graph coloring problem in the future.",12 binary classification based on potentials. we introduce a simple and computationally trivial method for binary classification based on the evaluation of potential functions. we demonstrate that despite the conceptual and computational simplicity of the method its performance can match or exceed that of standard support vector machine methods.,4 "optimal ordered problem solver. we present a novel, general, optimally fast, incremental way of searching for a universal algorithm that solves each task in a sequence of tasks. the optimal ordered problem solver (oops) continually organizes and exploits previously found solutions to earlier tasks, efficiently searching not only the space of domain-specific algorithms, but also the space of search algorithms. essentially we extend the principles of optimal nonincremental universal search to build an incremental universal learner that is able to improve itself through experience. in illustrative experiments, our self-improver becomes the first general system that learns to solve all n disk towers of hanoi tasks (solution size 2^n-1) for n up to 30, profiting from previously solved, simpler tasks involving samples of a simple context free language.",4 "distributed representation of subgraphs. network embeddings have become very popular in learning effective feature representations of networks. motivated by the recent successes of embeddings in natural language processing, researchers have tried to find network embeddings in order to exploit machine learning algorithms for mining tasks like node classification and edge prediction. however, most of this work focuses on finding distributed representations of nodes, which are inherently ill-suited to tasks such as community detection that are intuitively dependent on subgraphs. here, we propose sub2vec, an unsupervised scalable algorithm to learn feature representations of arbitrary subgraphs. we provide means to characterize similarities between subgraphs, provide a theoretical analysis of sub2vec, and demonstrate that it preserves the so-called local proximity. we also highlight the usability of sub2vec by leveraging it for network mining tasks, like community detection.
we show that sub2vec achieves significant gains over state-of-the-art node-embedding methods. in particular, sub2vec offers an approach to generate a richer vocabulary of features for subgraphs to support representation and reasoning.",4 "understanding trajectory behavior: a motion pattern approach. mining the underlying patterns in gigantic and complex data is of great importance to data analysts. in this paper, we propose a motion pattern approach to mine frequent behaviors in trajectory data. motion patterns, defined by a set of highly similar flow vector groups in a spatial locality, have been shown to be effective in extracting dominant motion behaviors in video sequences. inspired by the applications and properties of motion patterns, we have designed a framework that successfully solves the general task of trajectory clustering. the proposed algorithm consists of four phases: flow vector computation, motion component extraction, motion component reachability set creation, and motion pattern formation. in the first phase, we break trajectories into flow vectors that indicate instantaneous movements. in the second phase, via a kmeans clustering approach, we create motion components by clustering the flow vectors with respect to location and velocity. next, we create the motion components' reachability sets in terms of spatial proximity and motion similarity. finally, in the fourth phase, we cluster motion components using agglomerative clustering with the weighted jaccard distance between the motion components' signatures, a set created using path reachability. we have evaluated the effectiveness of the proposed method in an extensive set of experiments on diverse datasets. further, we have shown how the proposed method handles difficulties in the general task of trajectory clustering that challenge the existing state-of-the-art methods.",4 "a riemannian network for spd matrix learning. symmetric positive definite (spd) matrix learning methods have become popular in many image and video processing tasks, thanks to their ability to learn appropriate statistical representations while respecting the riemannian geometry of the underlying spd manifolds. in this paper we build a riemannian network architecture to open up a new direction of spd matrix non-linear learning in a deep model.
in particular, we devise bilinear mapping layers to transform input spd matrices into more desirable spd matrices, exploit eigenvalue rectification layers to apply a non-linear activation function to the new spd matrices, and design an eigenvalue logarithm layer to perform riemannian computing on the resulting spd matrices for regular output layers. for training the proposed deep network, we exploit a new backpropagation variant of stochastic gradient descent on stiefel manifolds to update the structured connection weights and the involved spd matrix data. we show in experiments that the proposed spd matrix network can be simply trained to outperform existing spd matrix learning and state-of-the-art methods in three typical visual classification tasks.",4 "compact representations of finite-state transducers. finite-state transducers give efficient representations of many natural language phenomena. they allow one to account for the complex lexicon restrictions encountered, without involving the use of a large set of complex rules that are difficult to analyze. we show here that these representations can be made very compact, indicate how to perform the corresponding minimization, and point out interesting linguistic side-effects of this operation.",2 "muse: modularizing unsupervised sense embeddings. this paper proposes to address the word sense ambiguity issue in an unsupervised manner, where word sense representations are learned along with a word sense selection mechanism given contexts. prior work on learning multi-sense embeddings suffered from either ambiguity of different-level embeddings or inefficient sense selection. the proposed modular framework, muse, implements flexible modules to optimize distinct mechanisms, achieving the first purely sense-level representation learning system with linear-time sense selection. we leverage reinforcement learning to enable joint training of the proposed modules, and introduce various exploration techniques on sense selection for better robustness. experiments on benchmark data show that the proposed approach achieves state-of-the-art performance on synonym selection as well as on contextual word similarities in terms of maxsimc.",4 "improving video generation for multi-functional applications.
in this paper, we aim to improve the state-of-the-art in video generative adversarial networks (gans) with a view towards multi-functional applications. our improved video gan model does not separate foreground from background nor dynamic from static patterns, but learns to generate the entire video clip conjointly. the model can thus be trained to generate - and learn from - a broad set of videos with no restriction. this is achieved by designing a robust one-stream video generation architecture with an extension of the state-of-the-art wasserstein gan framework that allows for better convergence. the experimental results show that our improved video gan model outperforms state-of-the-art video generative models on multiple challenging datasets. furthermore, we demonstrate the superiority of our model by successfully extending it to three challenging problems: video colorization, video inpainting, and future prediction. to the best of our knowledge, this is the first work using gans to colorize and inpaint video clips.",4 "variational relevance vector machines. the support vector machine (svm) of vapnik (1998) has become widely established as one of the leading approaches to pattern recognition and machine learning. it expresses predictions in terms of a linear combination of kernel functions centred on a subset of the training data, known as support vectors. despite its widespread success, the svm suffers from some important limitations, one of the most significant being that it makes point predictions rather than generating predictive distributions. recently tipping (1999) has formulated the relevance vector machine (rvm), a probabilistic model whose functional form is equivalent to the svm. it achieves comparable recognition accuracy to the svm, yet provides a full predictive distribution, and also requires substantially fewer kernel functions. the original treatment of the rvm relied on the use of type ii maximum likelihood (the `evidence framework') to provide point estimates of the hyperparameters which govern model sparsity. in this paper we show how the rvm can be formulated and solved within a completely bayesian paradigm through the use of variational inference, thereby giving a posterior distribution over both parameters and hyperparameters. we demonstrate the practicality and performance of the variational rvm using both synthetic and real world examples.",4 "highway and residual networks learn unrolled iterative estimation.
the past year saw the introduction of new architectures such as highway networks and residual networks which, for the first time, enabled the training of feedforward networks with dozens to hundreds of layers using simple gradient descent. while depth of representation has been posited as a primary reason for their success, there are indications that these architectures defy a popular view of deep learning as a hierarchical computation of increasingly abstract features at each layer. in this report, we argue that this view is incomplete and does not adequately explain several recent findings. we propose an alternative viewpoint based on unrolled iterative estimation -- a group of successive layers iteratively refines its estimates of the same features instead of computing an entirely new representation. we demonstrate that this viewpoint directly leads to the construction of highway and residual networks. finally we provide preliminary experiments and discuss the similarities and differences between the two architectures.",4 "structured-based curriculum learning for end-to-end english-japanese speech translation. sequence-to-sequence attentional-based neural network architectures have been shown to provide a powerful model for machine translation and speech recognition. recently, several works have attempted to extend these models to the end-to-end speech translation task. however, the usefulness of these models was only investigated on language pairs with similar syntax and word order (e.g., english-french or english-spanish). in this work, we focus on end-to-end speech translation tasks on syntactically distant language pairs (e.g., english-japanese) that require distant word reordering. to guide the encoder-decoder attentional model to learn this difficult problem, we propose a structured-based curriculum learning strategy. unlike conventional curriculum learning that gradually emphasizes difficult data examples, we formalize learning strategies from easier network structures to more difficult network structures. here, we start the training with end-to-end encoder-decoder models for speech recognition or text-based machine translation tasks, then gradually move to the end-to-end speech translation task. the experiment results show that the proposed approach could provide significant improvements in comparison with the one without curriculum learning.",4 "principal polynomial analysis.
this paper presents a new framework for manifold learning based on a sequence of principal polynomials that capture the possibly nonlinear nature of the data. the proposed principal polynomial analysis (ppa) generalizes pca by modeling the directions of maximal variance by means of curves, instead of straight lines. contrarily to previous approaches, ppa reduces to performing simple univariate regressions, which makes it computationally feasible and robust. moreover, ppa shows a number of interesting analytical properties. first, ppa is a volume-preserving map, which in turn guarantees the existence of the inverse. second, such an inverse can be obtained in closed form. invertibility is an important advantage over other learning methods, because it permits one to understand the identified features in the input domain where the data has physical meaning. moreover, it allows one to evaluate the performance of dimensionality reduction in sensible (input-domain) units. volume preservation also allows an easy computation of information theoretic quantities, such as the reduction in multi-information after the transform. third, the analytical nature of ppa leads to a clear geometrical interpretation of the manifold: it allows the computation of frenet-serret frames (local features) and of generalized curvatures at any point of the space. fourth, the analytical jacobian allows the computation of the metric induced by the data, thus generalizing the mahalanobis distance. these properties are demonstrated theoretically and illustrated experimentally. the performance of ppa is evaluated in dimensionality and redundancy reduction, on both synthetic and real datasets from the uci repository.",19 "towards case-based preference elicitation: similarity measures on preference structures. while decision theory provides an appealing normative framework for representing rich preference structures, eliciting utility or value functions typically incurs a large cost. for many applications involving interactive systems this overhead precludes the use of formal decision-theoretic models of preference. instead of performing elicitation in a vacuum, it would be useful if we could augment directly elicited preferences with appropriate default information. in this paper we propose a case-based approach to alleviating the preference elicitation bottleneck.
assuming the existence of a population of users from whom we have elicited complete or incomplete preference structures, we propose eliciting the preferences of a new user interactively and incrementally, using the closest existing preference structures as potential defaults. since a notion of closeness demands a measure of distance among preference structures, this paper takes the first step of studying various distance measures over fully and partially specified preference structures. we explore the use of euclidean distance, spearman's footrule, and define a new measure, the probabilistic distance. we provide computational techniques for all three measures.",4 "qualitative measures of ambiguity. this paper introduces a qualitative measure of ambiguity and analyses its relationship with other measures of uncertainty. probability measures relative likelihoods, while ambiguity measures the vagueness surrounding those judgments. ambiguity is an important representation of uncertain knowledge. it deals with a different type of uncertainty than that modeled by subjective probability or belief.",4 "robust multi biometric recognition using face and ear images. this study investigates the use of the ear as a biometric for authentication and shows experimental results obtained on a newly created dataset of 420 images. images are passed to a quality module in order to reduce the false rejection rate. the principal component analysis (eigen ear) approach was used, obtaining a 90.7 percent recognition rate. an improvement in the recognition results is obtained when the ear biometric is fused with the face biometric. the fusion is done at decision level, achieving a recognition rate of 96 percent.",4 "guided macro-mutation in a graded energy based genetic algorithm for protein structure prediction. protein structure prediction is considered one of the most challenging and computationally intractable combinatorial problems. thus, efficient modeling of the convoluted search space, clever use of energy functions, and most importantly, the use of effective sampling algorithms become crucial in addressing this problem. for protein structure modeling, the off-lattice model provides limited scope to exercise and evaluate algorithmic developments due to its astronomically large set of data-points.
in contrast, the on-lattice model widens the scope and permits studying relatively larger proteins because of its finite set of data-points. in this work, we take full advantage of the on-lattice model by using a face-centered-cube lattice, which has the highest packing density and the maximum degree of freedom. we propose a graded energy based genetic algorithm (ga), which strategically mixes the miyazawa-jernigan (mj) energy with the hydrophobic-polar (hp) energy, for conformational search. in our application, we introduce a 2x2 hp energy guided macro-mutation operator within the ga to explore the best possible local changes exhaustively. conversely, the 20x20 mj energy model - the ultimate objective function of our ga that needs to be minimized - considers the impacts amongst the 20 different amino acids and allows searching for globally acceptable conformations. on a set of benchmark proteins, the proposed approach outperformed state-of-the-art approaches in terms of free energy levels and root-mean-square deviations.",4 "metric learning for generalizing spatial relations to new objects. human-centered environments are rich with a wide variety of spatial relations between everyday objects. for autonomous robots to operate effectively in such environments, they should be able to reason about these relations and generalize them to objects with different shapes and sizes. for example, having learned to place a toy inside a basket, a robot should be able to generalize this concept using a spoon and a cup. this requires a robot to have the flexibility to learn arbitrary relations in a lifelong manner, making it challenging for an expert to pre-program it with sufficient knowledge beforehand. in this paper, we address the problem of learning spatial relations by introducing a novel method from the perspective of distance metric learning. our approach enables a robot to reason about the similarity between pairwise spatial relations, thereby enabling it to use its previous knowledge when presented with a new relation to imitate. we show how this makes it possible to learn arbitrary spatial relations from non-expert users using a small number of examples and in an interactive manner. our extensive evaluation with real-world data demonstrates the effectiveness of our method in reasoning about a continuous spectrum of spatial relations and generalizing them to new objects.",4 "understanding user instructions by utilizing open knowledge for service robots.
understanding user instructions in natural language is an active research topic in ai and robotics. typically, natural user instructions are high-level and can be reduced to low-level tasks expressed in common verbs (e.g., `take', `get', `put'). for robots to understand such instructions, one of the key challenges is to process high-level user instructions and achieve the specified tasks with the robots' primitive actions. to address this, we propose novel algorithms that utilize the semantic roles of common verbs defined in semantic dictionaries and integrate multiple open knowledge sources to generate task plans. specifically, we present a new method for matching and recovering the semantics of user instructions and a novel task planner that exploits functional knowledge of the robot's action model. to verify and evaluate our approach, we implemented a prototype system using knowledge from several open resources. experiments on our system confirmed the correctness and efficiency of our algorithms. notably, our system has been deployed on the kejia robot, which participated in the annual robocup@home competitions in the past three years and achieved encouragingly high scores in the benchmark tests.",4 "speech vocoding for laboratory phonology. using phonological speech vocoding, we propose a platform for exploring relations between phonology and speech processing, and in broader terms, for exploring relations between the abstract and physical structures of a speech signal. our goal is to make a step towards bridging phonology and speech processing and to contribute to the program of laboratory phonology. we show three application examples for laboratory phonology: compositional phonological speech modelling, a comparison of phonological systems, and an experimental phonological parametric text-to-speech (tts) system. the featural representations of the following three phonological systems are considered in this work: (i) government phonology (gp), (ii) the sound pattern of english (spe), and (iii) the extended spe (espe). comparing the gp- and espe-based vocoded speech, we conclude that the latter achieves slightly better results than the former. however, gp - the more compact phonological speech representation - performs comparably to the systems with a higher number of phonological features.
the parametric tts based on the phonological speech representation, trained on an unlabelled audiobook in an unsupervised manner, achieves intelligibility of 85% of that of state-of-the-art parametric speech synthesis. we envision that the presented approach paves the way for researchers in both fields to form meaningful hypotheses that are explicitly testable using the concepts developed and exemplified in this paper. on the one hand, laboratory phonologists might test the applied concepts of their theoretical models, and on the other hand, the speech processing community may utilize the concepts developed for the theoretical phonological models for improvements to current state-of-the-art applications.",4 "noise models in feature-based stereo visual odometry. feature-based visual structure and motion reconstruction pipelines, common in visual odometry and large-scale reconstruction from photos, use the locations of corresponding features in different images to determine the 3d structure of the scene, as well as the camera parameters associated with each image. the noise model, which defines the likelihood of the location of each feature in each image, is a key factor in the accuracy of such pipelines, alongside the optimization strategy. many different noise models have been proposed in the literature; in this paper we investigate the performance of several. we evaluate these models specifically w.r.t. stereo visual odometry, as this task is both simple (camera intrinsics are constant and known; geometry can be initialized reliably) and has datasets with ground truth readily available (kitti odometry and the new tsukuba stereo dataset). our evaluation shows that the noise models most adaptable to the varying nature of the noise generally perform better.",4 "mining object parts from cnns via active question-answering. given a convolutional neural network (cnn) that is pre-trained for object classification, this paper proposes to use active question-answering to semanticize neural patterns in the conv-layers of the cnn and mine part concepts. for each part concept, we mine neural patterns in the pre-trained cnn that are related to the target part, and use these patterns to construct an and-or graph (aog) to represent a four-layer semantic hierarchy of the part. as an interpretable model, the aog associates different cnn units with different explicit object parts. we use active human-computer communication to incrementally grow such an aog on the pre-trained cnn as follows.
we allow the computer to actively identify objects whose neural patterns cannot be explained by the current aog. then, the computer asks humans about the unexplained objects, and uses the answers to automatically discover certain cnn patterns corresponding to the missing knowledge. we incrementally grow the aog to encode the new knowledge discovered during the active-learning process. in experiments, our method exhibits high learning efficiency. our method uses about 1/6-1/3 of the part annotations for training, yet achieves similar or better part-localization performance than fast-rcnn methods.",4 "cforb: circular freak-orb visual odometry. we present a novel visual odometry algorithm entitled circular freak-orb (cforb). this algorithm detects features using the well-known orb algorithm [12] and computes feature descriptors using the freak algorithm [14]. cforb is invariant to both rotation and scale changes, and is suitable for use in environments with uneven terrain. two visual geometric constraints have been utilized in order to remove invalid feature descriptor matches. these constraints have not previously been utilized in a visual odometry algorithm. a variation of circular matching [16] has also been implemented. this allows features to be matched between images without being dependent upon the epipolar constraint. the algorithm has been run on the kitti benchmark dataset and achieves a competitive average translational error of $3.73 \%$ and an average rotational error of $0.0107 deg/m$. cforb has also been run in an indoor environment and achieved an average translational error of $3.70 \%$. when running cforb in a highly textured environment with an approximately uniform feature spread across the images, the algorithm achieves an average translational error of $2.4 \%$ and an average rotational error of $0.009 deg/m$.",4 "sequence-to-sequence models can directly translate foreign speech. we present a recurrent encoder-decoder deep neural network architecture that directly translates speech in one language into text in another. the model does not explicitly transcribe the speech into text in the source language, nor does it require supervision from the ground truth source language transcription during training. we apply a slightly modified sequence-to-sequence with attention architecture that has previously been used for speech recognition and show that it can be repurposed for this more complex task, illustrating the power of attention-based models.
a single model trained end-to-end obtains state-of-the-art performance on the fisher callhome spanish-english speech translation task, outperforming a cascade of independently trained sequence-to-sequence speech recognition and machine translation models by 1.8 bleu points on the fisher test set. in addition, we find that making use of the training data in both languages by multi-task training of sequence-to-sequence speech translation and recognition models with a shared encoder network can improve performance by a further 1.4 bleu points.",4 "variational intrinsic control. in this paper we introduce a new unsupervised reinforcement learning method for discovering the set of intrinsic options available to an agent. this set is learned by maximizing the number of different states an agent can reliably reach, as measured by the mutual information between the set of options and the option termination states. to this end, we instantiate two policy gradient based algorithms, one that creates an explicit embedding space of options and one that represents options implicitly. the algorithms also provide an explicit measure of empowerment in a given state that can be used by an empowerment maximizing agent. the algorithms scale well with function approximation, and we demonstrate their applicability on a range of tasks.",4 "well-founded argumentation semantics for extended logic programming. this paper defines an argumentation semantics for extended logic programming and shows its equivalence to the well-founded semantics with explicit negation. we set up a general framework in which we extensively compare this semantics to other argumentation semantics, including those of dung, and of prakken and sartor. we present a general dialectical proof theory for these argumentation semantics.",4 deep neural networks - a brief history. an introduction to deep neural networks and their history.,4 "sppam - a statistical preprocessing algorithm. most machine learning tools work with a single table where each row is an instance and each column is an attribute. each cell of the table contains an attribute value for an instance. this representation prevents one important form of learning, that is, classification based on groups of correlated records, such as multiple exams of a single patient, internet customer preferences, or weather forecast prediction from sea conditions on a given day.
to some extent, relational learning methods, such as inductive logic programming, can capture this correlation through the use of intensional predicates added to the background knowledge. in this work, we propose sppam, an algorithm that aggregates past observations into one single record. we show that applying sppam to the original correlated data, before the learning task, can produce classifiers that are better than the ones trained using all the records.",4 "a hybrid data clustering approach using k-means and flower pollination algorithm. data clustering is a technique for grouping a set of objects into a known number of groups. several approaches are widely applied to data clustering so that objects within the same cluster are similar while objects in different clusters are far from each other. k-means is one of the most familiar center based clustering algorithms, since its implementation is easy and it converges fast. however, the k-means algorithm suffers from sensitivity to initialization, and hence can be trapped in local optima. the flower pollination algorithm (fpa) is a global optimization technique that avoids trapping in local optimum solutions. in this paper, a novel hybrid data clustering approach using the flower pollination algorithm and k-means (fpakm) is proposed. the proposed algorithm's results are compared with those of k-means and fpa on eight datasets. from the experimental results, fpakm is better than both fpa and k-means.",4 "long-term ensemble learning of visual place classifiers. this paper addresses the problem of cross-season visual place classification (vpc) from the novel perspective of long-term map learning. our goal is to enable transfer learning efficiently from one season to the next, at a small constant cost, and without wasting the robot's available long-term memory by memorizing large amounts of training data. to realize a good tradeoff between generalization and specialization abilities, we employ an ensemble of deep convolutional neural network (dcn) classifiers and consider the task of scheduling (when and which classifiers to retrain), given the previous season's dcn classifiers as the sole prior knowledge. we present a unified framework for retraining scheduling and discuss practical implementation strategies. furthermore, we address the task of partitioning a robot's workspace into places, to define place classes in an unsupervised manner rather than using uniform partitioning, so as to maximize vpc performance.
experiments using the publicly available nclt dataset revealed that retraining scheduling of the dcn classifier ensemble is crucial to performance, which was significantly increased by using planned scheduling.",4 "deep retinal image understanding. this paper presents deep retinal image understanding (driu), a unified framework of retinal image analysis that provides both retinal vessel and optic disc segmentation. we make use of deep convolutional neural networks (cnns), which have proven revolutionary in other fields of computer vision such as object detection and image classification, and we bring their power to the study of eye fundus images. driu uses a base network architecture on which two sets of specialized layers are trained to solve both the retinal vessel and optic disc segmentation tasks. we present experimental validation, both qualitative and quantitative, on four public datasets for these tasks. in them, driu presents super-human performance, that is, it shows results more consistent with a gold standard than a second human annotator used as a control.",4 "online group feature selection. online feature selection with dynamic features has become an active research area in recent years. however, in some real-world applications such as image analysis and email spam filtering, features may arrive in groups. existing online feature selection methods evaluate features individually, while existing group feature selection methods cannot handle online processing. motivated by this, we formulate the online group feature selection problem and propose a novel selection approach for it. the proposed approach consists of two stages: online intra-group selection and online inter-group selection. in the intra-group selection, we use spectral analysis to select discriminative features in each group as it arrives. in the inter-group selection, we use lasso to select a globally optimal subset of features. this 2-stage procedure continues until no more features come or predefined stopping conditions are met. extensive experiments conducted on benchmark and real-world data sets demonstrate that the proposed approach outperforms other state-of-the-art online feature selection methods.",4 "classification with low rank and missing data. we consider classification and regression tasks where we have missing data and assume that the (clean) data resides in a low rank subspace.
finding this hidden subspace is known to be computationally hard. nevertheless, using a non-proper formulation we give an efficient agnostic algorithm that classifies as well as the best linear classifier coupled with the best low-dimensional subspace in which the data resides. a direct implication is that our algorithm can linearly (and non-linearly through kernels) classify provably as well as the best classifier that has access to the full data.",4 "a novel approach for detecting the pose orientation of a 3d face required for face recognition. in this paper we present a novel approach that takes as input a 3d image and gives as output its pose, i.e., it tells whether the face is oriented with respect to the x or z axes, with angles of rotation up to 40 degrees. the experiments were performed on the frav3d database. on applying the proposed algorithm to the 3d facial surfaces obtained, i.e., 848 3d face images, our method detected the pose correctly for 566 face images, thus giving approximately 67% correct pose detection.",4 "leveraging subjective human annotation for clustering historic newspaper articles. the new york public library is participating in the chronicling america initiative to develop an online searchable database of historically significant newspaper articles. microfilm copies of the newspapers are scanned in high resolution and optical character recognition (ocr) software is run on them. the text from the ocr provides a wealth of data for opinion researchers and historians. however, the categorization of articles provided by the ocr engine is rudimentary, and a large number of articles are labeled editorial without further grouping. manually sorting articles into fine-grained categories is time consuming, if not impossible, given the size of the corpus. this paper studies techniques for automatic categorization of newspaper articles so as to enhance search and retrieval on the archive. we explore unsupervised (e.g. kmeans) and semi-supervised (e.g. constrained clustering) learning algorithms to develop article categorization schemes geared towards the needs of end-users. a pilot study was designed to understand whether there was unanimous agreement amongst patrons regarding how articles should be categorized. it was found that the task was subjective and consequently automated algorithms that could deal with subjective labels were used.
while the small scale pilot study was extremely helpful in designing the machine learning algorithms, a much larger system needs to be developed to collect annotations from users of the archive. the ""bodhi"" system, currently under development, is a step in that direction, allowing users to correct wrongly scanned ocr and to provide keywords and tags for the newspaper articles they use frequently. upon successful implementation of a beta version of this system, we hope it can be integrated with the existing software developed for the chronicling america project.",4 "diffusion fingerprints. we introduce, test and discuss a method for classifying and clustering data modeled as directed graphs. the idea is to start diffusion processes from a subset of a data collection, generating corresponding distributions for reaching points in the network. these distributions take the form of high-dimensional numerical vectors and capture essential topological properties of the original dataset. we show how these diffusion vectors can be successfully applied to obtain state-of-the-art accuracies in the problem of extracting pathways from metabolic networks. we also provide a guideline to illustrate how to use our method for classification problems, and discuss important details of its implementation. in particular, we present a simple dimensionality reduction technique that lowers the computational cost of classifying diffusion vectors, while leaving the predictive power of the classification process substantially unaltered. although the method has parameters, the results we obtain show its flexibility and power. this should make it helpful in many contexts.",19 "postponing branching decisions. solution techniques for constraint satisfaction and optimisation problems often make use of backtrack search methods, exploiting variable and value ordering heuristics. in this paper, we propose and analyse a simple method to apply in case the value ordering heuristic produces ties: postponing the branching decision. to this end, we group together the values in a tie, branch on this sub-domain, and defer the decision among them to lower levels of the search tree. we show theoretically and experimentally that this simple modification can dramatically improve the efficiency of the search strategy.
although in practise similar methods may have been applied already, to our knowledge no empirical or theoretical study has been proposed in the literature to identify to what extent this strategy should be used.",4 "fuzzy fibers: uncertainty in dmri tractography. fiber tracking based on diffusion weighted magnetic resonance imaging (dmri) allows for noninvasive reconstruction of fiber bundles in the human brain. in this chapter, we discuss sources of error and uncertainty in this technique, and review strategies that afford a more reliable interpretation of its results. this includes methods for computing and rendering probabilistic tractograms, which estimate precision in the face of measurement noise and artifacts. however, we also address aspects that have received less attention so far, such as model selection, partial voluming, and the impact of parameters, both in preprocessing and in fiber tracking itself. we conclude by giving impulses for future research.",4 "deep optimization for spectrum repacking. over 13 months in 2016-17 the fcc conducted an ""incentive auction"" to repurpose radio spectrum from broadcast television to wireless internet. in the end, the auction yielded $19.8 billion, $10.05 billion of which was paid to 175 broadcasters for voluntarily relinquishing their licenses across 14 uhf channels. stations that continued broadcasting were assigned potentially new channels to fit as densely as possible into the channels that remained. the government netted $7 billion (used to pay down the national debt) after covering costs. a crucial element of the auction design was the construction of a solver, dubbed satfc, that determined whether sets of stations could be ""repacked"" in this way; it needed to run every time a station was given a price quote. this paper describes the process by which we built satfc. we adopted an approach we dub ""deep optimization"", taking a data-driven, highly parametric, and computationally intensive approach to solver design. specifically, to build satfc we designed software that could pair both complete and local-search sat-encoded feasibility checking with a wide range of domain-specific techniques. we used automatic algorithm configuration techniques to construct a portfolio of eight complementary algorithms to be run in parallel, aiming to achieve good performance on instances that arose in proprietary auction simulations. to evaluate the impact of our solver in this paper, we built an open-source reverse auction simulator.
we found that within the short time budget required in practice, satfc solved 95% of the problems it encountered. furthermore, the incentive auction paired with satfc produced nearly optimal allocations in a restricted setting and substantially outperformed alternatives at national scale.",4 "robust bayesian optimization with student-t likelihood. bayesian optimization has recently attracted the attention of the automatic machine learning community for its excellent results in hyperparameter tuning. bo is characterized by the sample efficiency with which it can optimize expensive black-box functions. this efficiency is achieved in a similar fashion to learning to learn methods: surrogate models (typically in the form of gaussian processes) learn the target function and perform intelligent sampling. the surrogate model can be applied even in the presence of noise; however, as with most regression methods, it is sensitive to outlier data. this can result in erroneous predictions and, in the case of bo, in biased and inefficient exploration. in this work, we present a gp model that is robust to outliers, which uses a student-t likelihood to segregate outliers and robustly conduct bayesian optimization. we present numerical results evaluating the proposed method on both artificial functions and real problems.",4 "aggregated sparse attention for steering angle prediction. in this paper, we apply an attention mechanism to autonomous driving for steering angle prediction. we propose a first model, applying the recently introduced sparse attention mechanism to the visual domain, as well as an aggregated extension of this model. we show the improvement of the proposed method, comparing it to no attention as well as to different types of attention.",4 "metric learning across heterogeneous domains by respectively aligning both priors and posteriors. in this paper, we attempt to learn a single metric across two heterogeneous domains, where the source domain is fully labeled with many samples while the target domain has only a few labeled samples but abundant unlabeled samples. to the best of our knowledge, this task has seldom been touched. the proposed learning model has a simple underlying motivation: all the samples in both the source and the target domains are mapped into a common space, where both their priors p(sample)s and their posteriors p(label|sample)s are forced to be respectively aligned as much as possible.
we show that the two mappings, from the source domain and from the target domain to the common space, can be reparameterized into a single positive semi-definite (psd) matrix. we then develop an efficient bregman projection algorithm to optimize the psd matrix, in which a logdet function is used for regularization. furthermore, we also show that this model can be easily kernelized, and verify its effectiveness on a cross-language retrieval task and a cross-domain object recognition task.",4 "knapsack constrained contextual submodular list prediction with application to multi-document summarization. we study the problem of predicting a set or list of options under a knapsack constraint. the quality of such lists is evaluated by a submodular reward function that measures both quality and diversity. similar to dagger (ross et al., 2010), by a reduction to online learning, we show how to adapt two sequence prediction models to imitate greedy maximization under knapsack constraint problems: conseqopt (dey et al., 2012) and scp (ross et al., 2013). experiments on extractive multi-document summarization show that our approach outperforms existing state-of-the-art methods.",4 "learning with latent language. the named concepts and compositional operators present in natural language provide a rich source of information about the kinds of abstractions humans use to navigate the world. can this linguistic background knowledge improve the generality and efficiency of learned classifiers and control policies? this paper aims to show that using the space of natural language strings as a parameter space is an effective way to capture natural task structure. in a pretraining phase, we learn a language interpretation model that transforms inputs (e.g. images) into outputs (e.g. labels) given natural language descriptions. to learn a new concept (e.g. a classifier), we can then search directly in the space of descriptions to minimize the interpreter's loss on training examples. crucially, our models do not require language data to learn these concepts: language is used only in pretraining to impose structure on subsequent learning. results on image classification, text editing, and reinforcement learning show that, in all settings, models with a linguistic parameterization outperform those without.",4 "feature discovery and visualization of robot mission data using convolutional autoencoders and bayesian nonparametric topic models.
the gap between our ability to collect interesting data and our ability to analyze these data is growing at an unprecedented rate. recent algorithmic attempts to fill this gap have employed unsupervised tools to discover structure in data. some of the most successful approaches have used probabilistic models to uncover latent thematic structure in discrete data. despite the success of these models on textual data, they have not generalized as well to image data, in part because of the spatial and temporal structure that may exist in an image stream. we introduce a novel unsupervised machine learning framework that incorporates the ability of convolutional autoencoders to discover features from images that directly encode spatial information, within a bayesian nonparametric topic model that discovers meaningful latent patterns within discrete data. using this hybrid framework, we overcome the fundamental dependency of traditional topic models on rigidly hand-coded data representations, while simultaneously encoding spatial dependency in our topics without adding model complexity. we apply this model to the motivating application of high-level scene understanding and mission summarization for exploratory marine robots. our experiments on a seafloor dataset collected by a marine robot show that the proposed hybrid framework outperforms current state-of-the-art approaches on the task of unsupervised seafloor terrain characterization.",4 "systematic evaluation of cnn advances on the imagenet. this paper systematically studies the impact of a range of recent advances in cnn architectures and learning methods on the object categorization (ilsvrc) problem. the evaluation tests the influence of the following choices of architecture: non-linearity (relu, elu, maxout, compatibility with batch normalization), pooling variants (stochastic, max, average, mixed), network width, classifier design (convolutional, fully-connected, spp), image pre-processing, and of learning parameters: learning rate, batch size, cleanliness of the data, etc. the performance gains of the proposed modifications are first tested individually and then in combination. the sum of the individual gains is bigger than the observed improvement when all modifications are introduced together, but the ""deficit"" is small, suggesting independence of their benefits.
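the "deficit" mentioned in the cnn-evaluation abstract above is simple arithmetic, illustrated here with toy numbers. all of the accuracy values and modification names below are hypothetical placeholders, not the paper's measurements.

```python
# toy illustration (hypothetical numbers): comparing the sum of individual
# accuracy gains with the gain observed when all modifications are combined.
baseline = 0.70
individual_gains = {"mod_a": 0.012, "mod_b": 0.008, "mod_c": 0.005}
combined_accuracy = 0.722  # accuracy with all modifications applied at once

sum_of_gains = sum(individual_gains.values())    # 0.025
combined_gain = combined_accuracy - baseline     # 0.022
deficit = sum_of_gains - combined_gain           # 0.003: small -> benefits
                                                 # are nearly independent
```

a small deficit relative to the sum of gains is what the abstract reads as (approximate) independence of the individual benefits.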
we show that the use of 128x128 pixel images is sufficient to make qualitative conclusions about optimal network structure that also hold for the full size caffe and vgg nets. the results are obtained an order of magnitude faster than with standard 224 pixel images.",4 "a classification framework for partially observed dynamical systems. we present a general framework for classifying partially observed dynamical systems based on the idea of learning in the model space. in contrast to existing approaches that use model point estimates to represent individual data items, we employ posterior distributions over models, thus taking into account in a principled manner the uncertainty due to both the generative (observational and/or dynamic noise) and observation (sampling in time) processes. we evaluate the framework on two testbeds - a biological pathway model and a stochastic double-well system. crucially, we show that the classifier performance is not impaired when the model class used for inferring posterior distributions is much more simple than the observation-generating model class, provided the reduced-complexity inferential model class captures the essential characteristics needed for the given classification task.",19 "learning population and subject-specific brain connectivity networks via mixed neighborhood selection. in neuroimaging data analysis, gaussian graphical models are often used to model statistical dependencies across spatially remote brain regions known as functional connectivity. typically, data are collected across a cohort of subjects and the scientific objectives consist of estimating population and subject-specific graphical models. a third objective that is often overlooked involves quantifying inter-subject variability and thus identifying regions or sub-networks that demonstrate heterogeneity across subjects. such information is fundamental in order to thoroughly understand the human connectome. we propose mixed neighborhood selection in order to simultaneously address the three aforementioned objectives. by recasting covariance selection as a neighborhood selection problem, we are able to efficiently learn the topology of each node. we introduce an additional mixed effect component to the neighborhood selection in order to simultaneously estimate a graphical model for the population of subjects as well as for each individual subject.
the proposed method is validated empirically through a series of simulations and is applied to resting state data for healthy subjects taken from the abide consortium.",19 "large batch training of convolutional networks. a common way to speed up training of large convolutional networks is to add computational units. training is then performed using data-parallel synchronous stochastic gradient descent (sgd), with the mini-batch divided between the computational units. as we increase the number of nodes, the batch size grows. but training with a large batch size often results in lower model accuracy. we argue that the current recipe for large batch training (linear learning rate scaling with warm-up) is not general enough and training may diverge. to overcome these optimization difficulties we propose a new training algorithm based on layer-wise adaptive rate scaling (lars). using lars, we scaled alexnet to a batch size of 8k, and resnet-50 to a batch size of 32k, without loss in accuracy.",4 "an electronic dictionary as a basis for nlp tools: the greek case. the existence of a dictionary in electronic form for modern greek (mg) is mandatory if one wants to process mg at the morphological and syntactic levels, since mg is a highly inflectional language with a marked stress and a spelling system with many characteristics carried over from ancient greek. moreover, such a tool becomes necessary if one wants to create efficient and sophisticated nlp applications with substantial linguistic backing and coverage. in the present paper we focus on the deployment of such an electronic dictionary for modern greek, which was built in two phases: first it was constructed on the basis of a spelling correction schema, and then it was reconstructed in order to become the platform for the deployment of a wider spectrum of nlp tools.",4 "improving multiple object tracking with optical flow and edge preprocessing. in this paper, we present a new method for detecting road users in an urban environment which leads to an improvement in multiple object tracking. our method takes as input a foreground image and improves the object detection and segmentation. this new image can then be used as an input to trackers that use foreground blobs from background subtraction. the first step is to create foreground images for all the frames in an urban video. then, starting from the original blobs of the foreground image, we merge the blobs that are close to one another and have similar optical flow.
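the layer-wise adaptive rate scaling (lars) rule from the large-batch abstract above can be sketched as follows. this is a minimal sketch of the core idea only: the published algorithm also incorporates weight decay and momentum, which are omitted here, and the trust coefficient and learning rate below are arbitrary example values.

```python
# minimal sketch of layer-wise adaptive rate scaling (lars): each layer's
# step size is scaled by the ratio ||w|| / ||g|| of its weight norm to its
# gradient norm, so layers with small gradients are not starved and layers
# with large gradients do not diverge at large batch sizes.
import math

def lars_update(weights, grads, global_lr=0.1, trust=0.001, eps=1e-9):
    """one sgd step with a layer-wise adaptive learning rate per layer."""
    new_weights = []
    for w, g in zip(weights, grads):
        w_norm = math.sqrt(sum(x * x for x in w))
        g_norm = math.sqrt(sum(x * x for x in g))
        local_lr = trust * w_norm / (g_norm + eps)  # layer-wise rate
        new_weights.append([x - global_lr * local_lr * gx
                            for x, gx in zip(w, g)])
    return new_weights

# one layer with weights [3.0, 4.0] (norm 5) and gradient [0.0, 2.0] (norm 2)
updated = lars_update([[3.0, 4.0]], [[0.0, 2.0]])
```

here local_lr = 0.001 * 5 / 2 = 0.0025, so the second weight moves by 0.1 * 0.0025 * 2 = 0.0005.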
the next step is extracting the edges of the different objects, to detect multiple objects that might be close to each other (and merged into one blob) and to adjust the size of the original blobs. at the same time, we use optical flow to detect occlusion between objects moving in opposite directions. finally, we make a decision on which information to keep in order to construct a new foreground image whose blobs can be used for tracking. the system is validated on four videos of an urban traffic dataset. our method improves the recall and precision metrics for the object detection task compared to a vanilla background subtraction method, and improves the clear mot metrics in the tracking task on these videos.",4 "optimizing monotone functions can be difficult. extending previous analyses on function classes like linear functions, we analyze how a simple (1+1) evolutionary algorithm optimizes pseudo-boolean functions that are strictly monotone. contrary to what one would expect, not all of these functions are easy to optimize. the choice of the constant $c$ in the mutation probability $p(n) = c/n$ can make a decisive difference. we show that if $c < 1$, then the (1+1) evolutionary algorithm finds the optimum of every such function in $\theta(n \log n)$ iterations. for $c=1$, we can still prove an upper bound of $o(n^{3/2})$. however, for $c > 33$, we present a strictly monotone function such that the (1+1) evolutionary algorithm with overwhelming probability does not find the optimum within $2^{\omega(n)}$ iterations. this is the first time we observe that a constant factor change of the mutation probability changes the run-time by more than constant factors.",4 "fine-grained recognition datasets for biodiversity analysis. in the following paper, we present and discuss challenging applications for fine-grained visual classification (fgvc): biodiversity and species analysis. we not only give details about two challenging new datasets suitable for computer vision research, with up to 675 highly similar classes, but also present first results with localized features using convolutional neural networks (cnn). we conclude with a list of challenging new research directions in the area of visual classification in biodiversity research.",4 "leveraging the power of gabor phase for face identification: a block matching approach. different from face verification, face identification is much more demanding. to reach comparable performance, an identifier needs to be roughly n times better than a verifier.
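the (1+1) evolutionary algorithm with mutation probability $c/n$ from the monotone-functions abstract above can be sketched as follows. this sketch runs it on the easy onemax benchmark, a strictly monotone function, not on the hard monotone functions the paper constructs; the parameter values are illustrative choices.

```python
# a minimal (1+1) evolutionary algorithm with per-bit mutation probability
# p = c/n, run on onemax (maximize the number of one-bits). with c < 1 the
# abstract above guarantees optimization in theta(n log n) iterations.
import random

def one_plus_one_ea(n, c=0.9, max_iters=100_000, seed=0):
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n)]
    p = c / n  # mutation probability per bit
    for t in range(1, max_iters + 1):
        # offspring: flip each bit independently with probability p
        y = [1 - b if rng.random() < p else b for b in x]
        if sum(y) >= sum(x):   # accept if the offspring is no worse
            x = y
        if sum(x) == n:        # optimum of onemax reached
            return t
    return None

iters = one_plus_one_ea(20)
```

for n = 20 and c = 0.9 the run finishes within a few hundred iterations, consistent with the theta(n log n) regime for c < 1.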
to expect a breakthrough in face identification, we need a fresh look at the fundamental building blocks of face recognition. in this paper we focus on the selection of a suitable signal representation and a better matching strategy for face identification. we demonstrate how gabor phase could be leveraged to improve the performance of face identification by using a block matching method. compared to existing approaches, the proposed method features much lower algorithmic complexity: face images are filtered with a single-scale gabor filter pair, and the matching is performed between any pairs of face images at hand without involving a training process. benchmark evaluations show that the proposed approach is totally comparable to and even better than state-of-the-art algorithms, which are typically based on features extracted from a large set of gabor faces and/or rely on heavy training processes.",4 "exploring new directions in iris recognition. a new approach to iris recognition based on circular fuzzy iris segmentation (cfis) and a gabor analytic iris texture binary encoder (gaitbe) is proposed and tested here. the cfis procedure is designed to guarantee that similar iris segments are obtained for similar eye images, despite the fact that the degree of occlusion may vary from one image to another. its result is a circular iris ring (concentric with the pupil) which approximates the actual iris. gaitbe proves better encoding of the statistical independence between iris codes extracted from different irides using the hilbert transform. all irides in the university of bath iris database were binary encoded at two different lengths (768 / 192 bytes) and tested in both single-enrollment and multi-enrollment identification scenarios. all cases illustrate the capacity of the newly proposed methodology to narrow the distribution of inter-class matching scores and, consequently, to guarantee a steeper descent of the false accept rate.",4 "learning, generalization, and functional entropy in random automata networks. it has been shown \citep{broeck90:physicalreview,patarnello87:europhys} that feedforward boolean networks can learn to perform specific simple tasks and generalize well if only a subset of the learning examples is provided for learning.
here, we extend this body of work and show experimentally that random boolean networks (rbns), where both the interconnections and the boolean transfer functions are chosen at random initially, can be evolved by using a state-topology evolution to solve simple tasks. we measure the learning and generalization performance, investigate the influence of the average node connectivity $k$ and of the system size $n$, and introduce a new measure that allows to better describe the network's learning and generalization behavior. we show that the connectivity of maximum entropy networks scales as a power-law of the system size $n$. our results show that networks with higher average connectivity $k$ (supercritical) achieve higher memorization and partial generalization. however, near critical connectivity, networks show a higher perfect generalization on the even-odd task.",4 "polyharmonic daubechies type wavelets in image processing and astronomy, ii. we consider the application of polyharmonic subdivision wavelets (of daubechies type) to image processing, in particular to astronomical images. the results show an essential advantage over standard multivariate wavelets and a potential for better compression.",12 "graph-cut ransac. a novel method for robust estimation, called graph-cut ransac, gc-ransac in short, is introduced. to separate inliers and outliers, it runs the graph-cut algorithm in the local optimization (lo) step which is applied when a so-far-the-best model is found. the proposed lo step is conceptually simple, easy to implement, globally optimal and efficient. gc-ransac is shown experimentally, both on synthesized tests and real image pairs, to be more geometrically accurate than state-of-the-art methods on a range of problems, e.g. line fitting, homography, affine transformation, fundamental and essential matrix estimation. it runs in real-time for many problems at a speed approximately equal to that of the less accurate alternatives (in milliseconds on a standard cpu).",4 "sparse neural networks with large learning diversity. coded recurrent neural networks with three levels of sparsity are introduced. the first level is related to the size of messages, which is much smaller than the number of available neurons. the second one is provided by a particular coding rule, acting as a local constraint on the neural activity.
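the line-fitting problem mentioned in the gc-ransac abstract above can be illustrated with a plain ransac loop. this is the baseline only: the graph-cut local optimization step that gc-ransac adds is omitted, and the iteration count, threshold, and toy data are assumptions for the example.

```python
# plain ransac for 2d line fitting: repeatedly fit a line to two random
# points and keep the hypothesis with the largest inlier set. gc-ransac
# (described above) improves on this loop with a graph-cut lo step.
import random

def ransac_line(points, n_iters=200, threshold=0.1, seed=0):
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(n_iters):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        # implicit line a*x + b*y + c = 0 through the two sampled points
        a, b, c = y2 - y1, x1 - x2, x2 * y1 - x1 * y2
        norm = (a * a + b * b) ** 0.5
        if norm == 0:
            continue  # degenerate sample
        inliers = [p for p in points
                   if abs(a * p[0] + b * p[1] + c) / norm < threshold]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    return best_inliers

# ten points on the line y = 2x plus two gross outliers
pts = [(x, 2 * x) for x in range(10)] + [(3.0, 9.5), (7.0, 1.0)]
inliers = ransac_line(pts)
```

with 200 hypotheses the loop recovers exactly the ten collinear points as inliers and rejects the two outliers.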
the third one is a characteristic of the low final connection density of the network after the learning phase. though the proposed network is very simple, since it is based on binary neurons and binary connections, it is able to learn a large number of messages and recall them, even in the presence of strong erasures. the performance of the network is assessed as a classifier and as an associative memory.",4 "kalman temporal differences. because reinforcement learning suffers from a lack of scalability, online value (and q-) function approximation has received increasing interest over the last decade. this contribution introduces a novel approximation scheme, namely the kalman temporal differences (ktd) framework, which exhibits the following features: sample-efficiency, non-linear approximation, non-stationarity handling and uncertainty management. a first ktd-based algorithm is provided for deterministic markov decision processes (mdp), which produces biased estimates in the case of stochastic transitions. then the extended ktd framework (xktd), solving stochastic mdp, is described. convergence is analyzed for special cases, for both deterministic and stochastic transitions. the related algorithms are experimented on classical benchmarks. they compare favorably to the state of the art while exhibiting the announced features.",4 "low-rank modeling and its applications in image analysis. low-rank modeling generally refers to a class of methods that solve problems by representing variables of interest as low-rank matrices. it has achieved great success in various fields including computer vision, data mining, signal processing and bioinformatics. recently, much progress has been made in theories, algorithms and applications of low-rank modeling, such as exact low-rank matrix recovery via convex programming and matrix completion applied to collaborative filtering. these advances have brought more attention to this topic. in this paper, we review the recent advances in low-rank modeling, the state-of-the-art algorithms, and the related applications in image analysis. we first give an overview of the concept of low-rank modeling and the challenging problems in this area. then, we summarize the models and algorithms for low-rank matrix recovery and illustrate their advantages and limitations with numerical experiments. next, we introduce applications of low-rank modeling in the context of image analysis.
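a building block behind many of the low-rank recovery methods surveyed in the abstract above is the truncated svd, which by the eckart-young theorem gives the best low-rank approximation in the frobenius norm. the sketch below illustrates this on a toy matrix of our own choosing.

```python
# minimal example of low-rank approximation via truncated svd: keep only
# the top singular values/vectors to obtain the best rank-r approximation
# in the frobenius norm (eckart-young theorem).
import numpy as np

def truncated_svd_approx(m, rank):
    """best rank-`rank` approximation of matrix m."""
    u, s, vt = np.linalg.svd(m, full_matrices=False)
    return u[:, :rank] @ np.diag(s[:rank]) @ vt[:rank, :]

# a rank-1 matrix (outer product of two vectors) is recovered exactly
# by its rank-1 truncated svd
m = np.outer([1.0, 2.0, 3.0], [4.0, 5.0])
approx = truncated_svd_approx(m, 1)
```

the convex-programming recovery methods mentioned above replace this hard rank constraint with the nuclear norm, but the truncated svd remains the workhorse inside most solvers.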
finally, we conclude the paper with discussions.",4 "development of a comprehensive devnagari numeral and character database for offline handwritten character recognition. in handwritten character recognition, a benchmark database plays an important role in evaluating the performance of various algorithms and the results obtained by various researchers. for the devnagari script, there is a lack of such an official benchmark. this paper focuses on the generation of an offline benchmark database for devnagari handwritten numerals and characters. the present work generated 5137 and 20305 isolated samples for the numeral and character databases, respectively, from 750 writers of all ages, sexes, education levels and professions. the offline sample images are stored in tiff image format, which occupies less memory. also, as the data is presented at binary level, the memory requirement is further reduced. to facilitate research on handwriting recognition for the devnagari script, free access to the databases is provided to researchers.",4 "pulcinella: a general tool for propagating uncertainty in valuation networks. we present pulcinella and its use in comparing uncertainty theories. pulcinella is a general tool for propagating uncertainty based on the local computation technique of shafer and shenoy. it may be specialized to different uncertainty theories: at the moment, pulcinella can propagate probabilities, belief functions, boolean values, and possibilities. moreover, pulcinella allows the user to easily define further specializations. to illustrate pulcinella, we analyze two examples using each of the four theories above. in the first one, we mainly focus on the intrinsic differences between the theories. in the second one, we take a knowledge engineer's viewpoint, and check the adequacy of each theory for a given problem.",4 "automated news suggestions for populating wikipedia entity pages. wikipedia entity pages are a valuable source of information for direct consumption and for knowledge-base construction, update and maintenance. facts in these entity pages are typically supported by references. recent studies show that as much as 20\% of the references are from online news sources. however, many entity pages are incomplete even if relevant information is already available in existing news articles. even for the already present references, there is often a delay between the news article publication time and the reference time.
in this work, we therefore look at wikipedia through the lens of news and propose a novel news-article suggestion task to improve news coverage in wikipedia and to reduce the lag of newsworthy references. our work finds direct application, as a precursor, in wikipedia page generation and knowledge-base acceleration tasks that rely on relevant and high quality input sources. we propose a two-stage supervised approach for suggesting news articles to entity pages for a given state of wikipedia. first, we suggest news articles to wikipedia entities (article-entity placement), relying on a rich set of features which take into account the \emph{salience} and \emph{relative authority} of entities, and the \emph{novelty} of news articles to entity pages. second, we determine the exact section in the entity page for the input article (article-section placement), guided by class-based section templates. we perform an extensive evaluation of our approach based on ground-truth data extracted from external references in wikipedia. we achieve a high precision value of 93\% in the \emph{article-entity} suggestion stage and up to 84\% in the \emph{article-section placement}. finally, we compare our approach against competitive baselines and show significant improvements.",4 "a motor learning mechanism at the neuron scale. based on existing data, we wish to put forward a biological model of the motor system at the neuron scale and indicate its implications for statistics and learning. specifically, neuron firing frequency and synaptic strength are probability estimates in essence, and lateral inhibition also has statistical implications. from the standpoint of learning, dendritic competition through retrograde messengers is the foundation of conditional reflex and grandmother cell coding, which are the kernel mechanisms of motor learning and sensory motor integration, respectively. finally, we compare the motor system with the sensory system. in short, we would like to bridge the gap between molecular evidence and computational models.",16 "asking too much? the rhetorical role of questions in political discourse. questions play a prominent role in social interactions, performing rhetorical functions that go beyond simple informational exchange. the surface form of a question can signal the intention and background of the person asking it, as well as the nature of their relation with the interlocutor.
while the informational nature of questions has been extensively examined in the context of question-answering applications, their rhetorical aspects have been largely understudied. in this work we introduce an unsupervised methodology for extracting surface motifs that recur in questions, and for grouping them according to their latent rhetorical role. by applying this framework to the setting of question sessions in the uk parliament, we show that the resulting typology encodes key aspects of the political discourse---such as the bifurcation in questioning behavior between government and opposition parties---and reveals new insights into the effects of a legislator's tenure and political career ambitions.",4 "classification regions of deep neural networks. the goal of this paper is to analyze the geometric properties of deep neural network classifiers in the input space. we specifically study the topology of classification regions created by deep networks, as well as their associated decision boundary. through a systematic empirical investigation, we show that state-of-the-art deep nets learn connected classification regions, and that the decision boundary in the vicinity of datapoints is flat along most directions. we further draw an essential connection between two seemingly unrelated properties of deep networks: their sensitivity to additive perturbations of the inputs, and the curvature of their decision boundary. the directions where the decision boundary is curved in fact remarkably characterize the directions to which the classifier is most vulnerable. we finally leverage a fundamental asymmetry in the curvature of the decision boundary of deep nets, and propose a method to discriminate between original images and images perturbed with small adversarial examples. we show the effectiveness of this purely geometric approach for detecting small adversarial perturbations in images, and for recovering the labels of perturbed images.",4 "deep occlusion reasoning for multi-camera multi-target detection. people detection in single 2d images has improved greatly in recent years. however, comparatively little of this progress has percolated into multi-camera multi-people tracking algorithms, whose performance still degrades severely when scenes become crowded. in this work, we introduce a new architecture that combines convolutional neural nets and conditional random fields to explicitly model those ambiguities.
one of its key ingredients are high-order crf terms that model potential occlusions and give our approach its robustness even when many people are present. our model is trained end-to-end and we show that it outperforms several state-of-the-art algorithms on challenging scenes.",4 "workflow complexity for collaborative interactions: where are the metrics? -- a challenge. in this paper, we introduce the problem of denoting and deriving the complexity of workflows (plans, schedules) in collaborative, planner-assisted settings where humans and agents are trying to jointly solve a task. the interactions -- and hence the workflows that connect the humans and the agents -- may differ according to the domain and the kind of agents. we adapt insights from prior work on human-agent teaming and workflow analysis to suggest metrics for workflow complexity. the main motivation behind this work is to highlight metrics for human comprehensibility of plans and schedules. the planning community has seen its fair share of work on the synthesis of plans that take diversity into account -- but what value do such plans hold if their generation is not guided at least in part by metrics that reflect the ease of engaging with and using those plans?",4 "modeling multiple annotator expertise in the semi-supervised learning scenario. learning algorithms normally assume that there is at most one annotation or label per data point. however, in some scenarios, such as medical diagnosis and on-line collaboration, multiple annotations may be available. in either case, obtaining labels for data points can be expensive and time-consuming (and in some circumstances ground-truth may not even exist). semi-supervised learning approaches have shown that utilizing the unlabeled data is often beneficial in these cases. this paper presents a probabilistic semi-supervised model and algorithm that allows for learning from both unlabeled and labeled data in the presence of multiple annotators. we assume that it is known which annotator labeled which data points. the proposed approach produces annotator models that allow us to provide (1) estimates of the true label and (2) annotator variable expertise for both labeled and unlabeled data. we provide numerical comparisons under various scenarios with respect to standard semi-supervised learning. experiments showed that the presented approach provides clear advantages over multi-annotator methods that do not use the unlabeled data and over methods that do not use multi-labeler information.",4 "how to elicit many probabilities.
in building bayesian belief networks, the elicitation of all the probabilities required can be a major obstacle. we learned the extent of this often-cited observation during the construction of the probabilistic part of a complex influence diagram in the field of cancer treatment. based upon our negative experiences with existing methods, we designed a new method for probability elicitation from domain experts. the method combines various ideas, among which are the ideas of transcribing probabilities and of using a scale with both numerical and verbal anchors for marking assessments. in the construction of the probabilistic part of our influence diagram, the method proved to allow the elicitation of many probabilities in little time.",4 "reasoning about cardinal directions between extended objects: the hardness result. the cardinal direction calculus (cdc) proposed by goyal and egenhofer is an expressive qualitative calculus for directional information about extended objects. early work has shown that consistency checking of complete networks of basic cdc constraints is tractable, while reasoning with the cdc in general is np-hard. this paper shows, however, that if some constraints are allowed to be unspecified, then consistency checking of possibly incomplete networks of basic cdc constraints is already intractable. this draws a sharp boundary between the tractable and intractable subclasses of the cdc. the result is achieved by a reduction from the well-known 3-sat problem.",4 "autoencoder based image compression: can the learning be quantization independent?. this paper explores the problem of learning transforms for image compression via autoencoders. usually, the rate-distortion performance of image compression is tuned by varying the quantization step size. in the case of autoencoders, this in principle would require learning one transform per rate-distortion point at a given quantization step size. here, we show that comparable performance can be obtained with a unique learned transform. the different rate-distortion points are then reached by varying the quantization step size at test time. this approach saves a lot of training time.",6 "500+ times faster than deep learning (a case study exploring faster methods for text mining stackoverflow). deep learning methods are useful for high-dimensional data and are becoming widely used in many areas of software engineering.
deep learners utilize extensive computational power and can take a long time to train -- making it difficult to widely validate, repeat and improve their results. further, they are not the best solution in all domains. for example, recent results show that for finding related stack overflow posts, a tuned svm performs similarly to a deep learner, but is significantly faster to train. this paper extends that recent result by clustering the dataset, then tuning the learners within each cluster. this approach is over 500 times faster than deep learning (and over 900 times faster if we use all the cores on a standard laptop computer). significantly, this faster approach generates classifiers nearly as good (within 2\% of the f1 score) as the much slower deep learning method. hence we recommend the faster methods since they are much easier to reproduce and utilize far fewer cpu resources. more generally, we recommend that before researchers release research results, they compare their supposedly sophisticated methods against simpler alternatives (e.g. applying simpler learners to build local models).",4 "character-based neural machine translation. neural machine translation (mt) has reached state-of-the-art results. however, one of the main challenges that neural mt still faces is dealing with large vocabularies and morphologically rich languages. in this paper, we propose a neural mt system using character-based embeddings in combination with convolutional and highway layers to replace the standard lookup-based word representations. the resulting unlimited-vocabulary and affix-aware source word embeddings are tested in a state-of-the-art neural mt system based on an attention-based bidirectional recurrent neural network. the proposed mt scheme provides improved results even when the source language is not morphologically rich. improvements of up to 3 bleu points are obtained on the german-english wmt task.",4 "neural nilm: deep neural networks applied to energy disaggregation. energy disaggregation estimates appliance-by-appliance electricity consumption from a single meter that measures the whole home's electricity demand. recently, deep neural networks have driven remarkable improvements in classification performance in neighbouring machine learning fields such as image classification and automatic speech recognition.
in this paper, we adapt three deep neural network architectures to energy disaggregation: 1) a form of recurrent neural network called `long short-term memory' (lstm); 2) denoising autoencoders; and 3) a network which regresses the start time, end time and average power demand of each appliance activation. we use seven metrics to test the performance of these algorithms on real aggregate power data from five appliances. tests are performed against a house not seen during training and against houses seen during training. we find that all three neural nets achieve better f1 scores (averaged over all five appliances) than either combinatorial optimisation or factorial hidden markov models, and that our neural net algorithms generalise well to an unseen house.",4 "towards understanding triangle construction problems. straightedge and compass construction problems are among the oldest and most challenging problems in elementary mathematics. the central challenge, for a human or for a computer program, in solving construction problems is the huge search space. in this paper we analyze one family of triangle construction problems, aiming at detecting a small core of the underlying geometry knowledge. the analysis leads to a small set of needed definitions, lemmas and primitive construction steps, and consequently, to a simple algorithm for automated solving of problems from this family. the approach can be applied to other families of construction problems.",4 "how open should open source be?. many open-source projects land security fixes in public repositories before shipping these patches to users. this paper presents attacks on such projects - taking firefox as a case-study - that exploit patch metadata to efficiently search for security patches prior to shipping. using access-restricted bug reports linked from patch descriptions, security patches can be immediately identified for 260 out of 300 days of firefox 3 development. in response to mozilla obfuscating the descriptions, we show how machine learning can exploit metadata such as the patch author to search for security patches, extending the total window of vulnerability by 5 months over an 8 month period when examining two patches daily.
we finally present strong evidence that further metadata obfuscation is unlikely to prevent information leaks, and we argue that open-source projects instead ought to keep security patches secret until they are ready to be released.",4 "improving automatic emotion recognition from speech using rhythm and temporal features. this paper is devoted to improving automatic emotion recognition from speech by incorporating rhythm and temporal features. research on automatic emotion recognition has so far mostly been based on applying features like mfccs, pitch and energy or intensity. our idea focuses on borrowing rhythm features from linguistic and phonetic analysis and applying them to the speech signal on the basis of acoustic knowledge only. in addition we exploit a set of temporal and loudness features. a segmentation unit is employed to start by separating the voiced/unvoiced and silence parts, and the features are explored on the different segments. thereafter different classifiers are used for classification. after selecting the top features using an igr filter we are able to achieve a recognition rate of 80.60 % on the berlin emotion database in the speaker dependent framework.",4 "efficient scene text localization and recognition with local character refinement. an unconstrained end-to-end text localization and recognition method is presented. the method detects initial text hypotheses in a single pass by an efficient region-based method and subsequently refines the text hypotheses using a more robust local text model, which deviates from the common assumption of region-based methods that all characters are detected as connected components. additionally, a novel feature based on character stroke area estimation is introduced. the feature is efficiently computed from a region distance map, it is invariant to scaling and rotations, and it allows text regions to be detected efficiently regardless of what portion of text they capture. the method runs in real time and achieves state-of-the-art text localization and recognition results on the icdar 2013 robust reading dataset.",4 "recognition performance of a structured language model. a new language model for speech recognition inspired by linguistic analysis is presented.
the model develops hidden hierarchical structure incrementally and uses it to extract meaningful information from the word history - thus enabling the use of extended distance dependencies - in an attempt to complement the locality of currently used trigram models. the structured language model, its probabilistic parameterization and its performance in a two-pass speech recognizer are presented. experiments on the switchboard corpus show an improvement in both perplexity and word error rate over conventional trigram models.",4 "interacting conceptual spaces. we propose applying the categorical compositional scheme of [6] to conceptual space models of cognition. in order to do this we introduce the category of convex relations as a new setting for categorical compositional semantics, emphasizing the convex structure important to conceptual space applications. we show how conceptual spaces for composite types such as adjectives and verbs can be constructed. we illustrate this new model with detailed examples.",4 "finite-state phonology: proceedings of the 5th workshop of the acl special interest group in computational phonology (sigphon). home page of the workshop proceedings, with pointers to the individually archived papers. it includes the front matter of the printed version of the proceedings.",4 "lstmvis: a tool for visual analysis of hidden state dynamics in recurrent neural networks. recurrent neural networks, and in particular long short-term memory (lstm) networks, are a remarkably effective tool for sequence modeling that learn a dense black-box hidden representation of their sequential input. researchers interested in better understanding these models have studied the changes in hidden state representations over time and noticed some interpretable patterns but also significant noise. in this work, we present lstmvis, a visual analysis tool for recurrent neural networks with a focus on understanding these hidden state dynamics. the tool allows users to select a hypothesis input range to focus on local state changes, to match these state changes to similar patterns in a large data set, and to align the results with structural annotations from their domain. we show several use cases of the tool for analyzing specific hidden state properties on datasets containing nesting, phrase structure, and chord progressions, and demonstrate how the tool can be used to isolate patterns for further statistical analysis.
we characterize the domain, the different stakeholders, and their goals and tasks.",4 "neural aesthetic image reviewer. recently, there has been rising interest in perceiving image aesthetics. existing works deal with image aesthetics as a classification or regression problem. to extend the cognition from rating to reasoning, a deeper understanding of aesthetics should be based on revealing why a high- or low-aesthetic score is assigned to an image. from this point of view, we propose a model referred to as the neural aesthetic image reviewer, which can not only give an aesthetic score for an image, but also generate a textual description explaining why the image leads to a plausible rating score. specifically, we propose two multi-task architectures based on shared aesthetically semantic layers and task-specific embedding layers at a high level, for performance improvement on the different tasks. to facilitate research on this problem, we collect the ava-reviews dataset, which contains 52,118 images and 312,708 comments in total. through multi-task learning, the proposed models can rate aesthetic images as well as produce comments in an end-to-end manner. it is confirmed that the proposed models outperform the baselines according to the performance evaluation on the ava-reviews dataset. moreover, we demonstrate experimentally that our model can generate textual reviews related to aesthetics, which are consistent with human perception.",4 "boltzmann machines and denoising autoencoders for image denoising. image denoising based on a probabilistic model of local image patches has been employed by various researchers, and recently a deep (denoising) autoencoder was proposed by burger et al. [2012] and xie et al. [2012] as a good model for this. in this paper, we propose that another popular family of models in the field of deep learning, called boltzmann machines, can perform image denoising as well as, or in certain cases of high levels of noise, better than denoising autoencoders. we empirically evaluate the two models on three different sets of images with different types and levels of noise. throughout the experiments we also examine the effect of the depth of the models. the experiments confirmed our claim and revealed that the performance can be improved by adding hidden layers, especially when the level of noise is high.",19 measurement of the amplitude of the moiré patterns in a digital autostereoscopic 3d display.
this article presents experimental measurements of the amplitude of the moir\'e patterns in a digital autostereoscopic barrier-type 3d display across a wide angular range with a small increment. the period and orientation of the moir\'e patterns are also measured as functions of the angle. simultaneous branches are observed and analyzed. a theoretical interpretation is also given. the results can help in preventing or minimizing the moir\'e effect in displays.,6 "ensembles of classifiers based on dimensionality reduction. we present a novel approach for the construction of ensemble classifiers based on dimensionality reduction. dimensionality reduction methods represent datasets using a small number of attributes while preserving the information conveyed by the original dataset. the ensemble members are trained based on dimension-reduced versions of the training set. these versions are obtained by applying dimensionality reduction to the original training set using different values of the input parameters. this construction meets both the diversity and accuracy criteria which are required to construct an ensemble classifier, where the former criterion is obtained by the various input parameter values and the latter is achieved due to the decorrelation and noise reduction properties of dimensionality reduction. in order to classify a test sample, it is first embedded into the dimension-reduced space of each individual classifier by using an out-of-sample extension algorithm. each classifier is then applied to the embedded sample and the classification is obtained via a voting scheme. we present three variations of the proposed approach based on the random projections, diffusion maps and random subspaces dimensionality reduction algorithms. we also present a multi-strategy ensemble which combines adaboost and diffusion maps. a comparison is made with the bagging, adaboost and rotation forest ensemble classifiers, and also with the base classifier which does not incorporate dimensionality reduction. our experiments used seventeen benchmark datasets from the uci repository. the results obtained by the proposed algorithms were superior in many cases to the other algorithms.",4 "region-based image retrieval revisited. the region-based image retrieval (rbir) technique is revisited.
in early attempts at rbir in the late 90s, researchers found many ways to specify region-based queries and spatial relationships; however, the way to characterize the regions, such as by using color histograms, was quite poor at that time. here, we revisit rbir by incorporating semantic specification of objects and intuitive specification of spatial relationships. our contributions are the following. first, to support multiple aspects of semantic object specification (category, instance, and attribute), we propose a multitask cnn feature that allows us to use deep learning techniques to jointly handle multi-aspect object specification. second, to help users specify spatial relationships among objects in an intuitive way, we propose recommendation techniques for spatial relationships. in particular, by mining the search results, the system can recommend feasible spatial relationships among the objects. the system can also recommend likely spatial relationships for the assigned object category names based on a language prior. moreover, object-level inverted indexing supports fast shortlist generation, and re-ranking based on spatial constraints provides users with an instant rbir experience.",4 "reinforcement learning based argument component detection. argument component detection (acd) is an important sub-task in argumentation mining. acd aims at detecting and classifying different argument components in natural language texts. historical annotations (has) are important features that human annotators consider when they manually perform the acd task. however, they are largely ignored by existing automatic acd techniques. reinforcement learning (rl) has proven to be an effective method for using historical annotations in some natural language processing tasks. in this work, we propose an rl-based acd technique, and evaluate its performance on two well-annotated corpora. the results suggest that, in terms of classification accuracy, has-augmented rl outperforms plain rl by up to 17.85%, and outperforms the state-of-the-art supervised learning algorithm by up to 11.94%.",4 "autoregressive kernels for time series. we propose in this work a new family of kernels for variable-length time series.
our work builds upon the vector autoregressive (var) model for multivariate stochastic processes: given a multivariate time series x, we consider the likelihood function p_{\theta}(x) under different parameters \theta of the var model as features to describe x. to compare two time series x and x', we form the product of their features p_{\theta}(x) p_{\theta}(x'), which is integrated out w.r.t \theta using a matrix normal-inverse wishart prior. among other properties, this kernel can be easily computed when the dimension of the time series is much larger than the lengths of the considered time series x and x'. it can also be generalized to time series taking values in arbitrary state spaces, as long as the state space is endowed with a kernel \kappa. in that case, the kernel between x and x' is a function of the gram matrices produced by \kappa on observations and subsequences of observations enumerated in x and x'. we describe a computationally efficient implementation of this generalization that uses low-rank matrix factorization techniques. these kernels are compared to other known kernels using a set of benchmark classification tasks carried out with support vector machines.",19 multiplierless modules for forward and backward integer wavelet transform. this article considers an architecture of a lossless wavelet filter bank in reprogrammable logic. it is based on second generation wavelets with a reduced number of operations. a new basic structure for a parallel architecture of the modules of the forward and backward integer discrete wavelet transform is proposed.,4 "recruitment market trend analysis with sequential latent variable models. recruitment market analysis provides valuable understanding of industry-specific economic growth and plays an important role for both employers and job seekers. with the rapid development of online recruitment services, massive recruitment data have been accumulated and enable a new paradigm for recruitment market analysis. however, traditional methods for recruitment market analysis largely rely on the knowledge of domain experts and classic statistical models, which are usually too general to model large-scale dynamic recruitment data, and have difficulties capturing the fine-grained market trends.
to this end, in this paper, we propose a new research paradigm for recruitment market analysis by leveraging unsupervised learning techniques for automatically discovering recruitment market trends based on large-scale recruitment data. specifically, we develop a novel sequential latent variable model, named mtlvm, which is designed for capturing the sequential dependencies of corporate recruitment states and is able to automatically learn the latent recruitment topics within a bayesian generative framework. in particular, to capture the variability of recruitment topics over time, we design hierarchical dirichlet processes for mtlvm. these processes allow the evolving recruitment topics to be generated dynamically. finally, we implement a prototype system to empirically evaluate our approach based on real-world recruitment data in china. indeed, by visualizing the results from mtlvm, we can successfully reveal many interesting findings, such as that the popularity of lbs related jobs reached its peak in the 2nd half of 2014 and decreased in 2015.",4 "automatic road lighting system (arls) model based on image processing of a moving object. using a vehicle toy (in what follows called a vehicle) as a moving object, an automatic road lighting system (arls) model is constructed. a digital video camera with 25 fps is used to capture the vehicle motion as it moves along the test segment of the road. the captured images are then processed to calculate the vehicle speed. this speed information, together with the position of the vehicle, is then used to control the lighting system along the path that the vehicle passes. the length of the road test segment is 1 m, the video camera is positioned about 1.1 m above the test segment, and the vehicle toy dimension is 13 cm \times 9.3 cm. in this model, the maximum speed that the arls can handle is about 1.32 m/s, and the highest performance of about 91% is obtained at a speed of 0.93 m/s.",4 "input-to-output gate to improve rnn language models. this paper proposes a reinforcing method that refines the output layers of existing recurrent neural network (rnn) language models. we refer to our proposed method as input-to-output gate (iog). iog has an extremely simple structure, and thus, can be easily combined with any rnn language models.
our experiments on the penn treebank and wikitext-2 datasets demonstrate that iog consistently boosts the performance of several different types of current topline rnn language models.",4 "multinomial adversarial networks for multi-domain text classification. many text classification tasks are known to be highly domain-dependent. unfortunately, the availability of training data can vary drastically across domains. worse still, for some domains there may not be any annotated data at all. in this work, we propose a multinomial adversarial network (man) to tackle the text classification problem in this real-world multidomain setting (mdtc). we provide theoretical justifications for the man framework, proving that different instances of mans are essentially minimizers of various f-divergence metrics (ali and silvey, 1966) among multiple probability distributions. mans are thus a theoretically sound generalization of traditional adversarial networks that discriminate over two distributions. more specifically, for the mdtc task, man learns features that are invariant across multiple domains by resorting to its ability to reduce the divergence among the feature distributions of each domain. we present experimental results showing that mans significantly outperform the prior art on the mdtc task. we also show that mans achieve state-of-the-art performance for domains with no labeled data.",4 "efficient post-selection inference for high-order interaction models. finding statistically significant high-order interaction features in predictive modeling is an important but challenging task. the difficulty lies in the fact that, for recent applications with high-dimensional covariates, the number of possible high-order interaction features would be extremely large. identifying statistically significant features from such a huge pool of candidates would be highly challenging in both the computational and statistical senses. to work on this problem, we consider a two stage algorithm where we first select a set of high-order interaction features by marginal screening, and then make statistical inferences on the regression model fitted only with the selected features. such statistical inferences are called post-selection inference (psi), and are receiving increasing attention in the literature. one of the seminal recent advancements in the psi literature is the work by lee et al.
in which the authors presented an algorithmic framework for computing exact sampling distributions in psi. the main challenge in applying their approach to our high-order interaction models is to cope with the fact that psi in general depends not only on the selected features but also on the unselected features, making it hard to apply to extremely high-dimensional high-order interaction models. the goal of this paper is to overcome this difficulty by introducing a novel efficient method for psi. our key idea is to exploit the underlying tree structure among the high-order interaction features, and to develop a pruning method of the tree which enables us to quickly identify a group of unselected features that are guaranteed to have no influence on the psi. the experimental results indicate that the proposed method allows us to reliably identify statistically significant high-order interaction features with reasonable computational cost.",19 "nystrom method for approximating the gmm kernel. the gmm (generalized min-max) kernel was recently proposed (li, 2016) as a measure of data similarity and was demonstrated to be effective in machine learning tasks. in order to use the gmm kernel for large-scale datasets, the prior work resorted to the (generalized) consistent weighted sampling (gcws) to convert the gmm kernel to a linear kernel. we call this approach ``gmm-gcws''. in the machine learning literature, there is a popular algorithm which we call ``rbf-rff''. that is, one can use ``random fourier features'' (rff) to convert the ``radial basis function'' (rbf) kernel to a linear kernel. it was empirically shown in (li, 2016) that rbf-rff typically requires substantially more samples than gmm-gcws in order to achieve comparable accuracies. the nystrom method is a general tool for computing nonlinear kernels, which again converts nonlinear kernels to linear kernels. we apply the nystrom method for approximating the gmm kernel, a strategy which we name ``gmm-nys''. in this study, our extensive experiments on a set of fairly large datasets confirm that gmm-nys is also a strong competitor of rbf-rff.",19 "identifying trends in word frequency dynamics. the word-stock of a language is a complex dynamical system in which words can be created, evolve, and become extinct. even more dynamic are the short-term fluctuations in word usage by individuals in a population.
building on the recent demonstration that a word's niche is a strong determinant of the future rise or fall of its frequency, here we introduce a model which allows us to distinguish persistent from temporary increases in frequency. our model is illustrated using a 10^8-word database from an online discussion group and a 10^11-word collection of digitized books. the model reveals a strong relation between changes in word dissemination and changes in frequency. aside from its implications for short-term word frequency dynamics, our observations are potentially important for language evolution, as new words must survive in the short term in order to survive in the long term.",15 "transductive zero-shot hashing via coarse-to-fine similarity mining. zero-shot hashing (zsh), which learns hashing models for novel/target classes without training data, is an important and challenging problem. most existing zsh approaches exploit transfer learning via an intermediate shared semantic representation between the seen/source classes and the novel/target classes. however, because the classes are disjoint, the hash functions learned from the source dataset are biased when applied directly to the target classes. in this paper, we study transductive zsh, i.e., we have unlabeled data for the novel classes. we put forward a simple yet efficient joint learning approach via coarse-to-fine similarity mining which transfers knowledge from the source data to the target data. it mainly consists of two building blocks in the proposed deep architecture: 1) a shared two-streams network, where the first stream operates on the source data and the second stream operates on the unlabeled data, to learn effective common image representations, and 2) a coarse-to-fine module, which begins with finding the most representative images from the target classes to detect the similarities among them, and then transfers the similarities of the source data to the target data in a greedy fashion. extensive evaluation results on several benchmark datasets demonstrate that the proposed hashing method achieves significant improvement over the state-of-the-art methods.",4 "multi-task policy search. learning policies that generalize across multiple tasks is an important and challenging research topic in reinforcement learning and robotics.
training individual policies for every single potential task is often impractical, especially for continuous task variations, requiring more principled approaches to share and transfer knowledge among similar tasks. we present a novel approach for learning a nonlinear feedback policy that generalizes across multiple tasks. the key idea is to define a parametrized policy as a function of both the state and the task, which allows learning a single policy that generalizes across multiple known and unknown tasks. applications of our novel approach to reinforcement and imitation learning in real-robot experiments are shown.",19 "composing music with grammar argumented neural networks and note-level encoding. creating aesthetically pleasing pieces of art, including music, has been a long-term goal for artificial intelligence research. despite recent successes of long-short term memory (lstm) recurrent neural networks (rnns) in sequential learning, lstm neural networks have not, by themselves, been able to generate natural-sounding music conforming to music theory. to transcend this inadequacy, we put forward a novel method for music composition that combines lstm with grammars motivated by music theory. the main tenets of music theory are encoded as grammar argumented (ga) filters on the training data, such that the machine can be trained to generate music inheriting the naturalness of human-composed pieces from the original dataset while adhering to the rules of music theory. unlike previous approaches, pitches and durations are encoded as one semantic entity, which we refer to as note-level encoding. this allows easy implementation of music theory grammars, as well as closer emulation of the thinking pattern of a musician. although the ga rules are applied only to the training data and never directly to the lstm music generation, the machine still composes music that possesses high incidences of diatonic scale notes, small pitch intervals and chords, in deference to music theory.",4 "nice: non-linear independent components estimation. we propose a deep learning framework for modeling complex high-dimensional densities called non-linear independent component estimation (nice). it is based on the idea that a good representation is one in which the data has a distribution that is easy to model.
for this purpose, a non-linear deterministic transformation of the data is learned that maps it to a latent space so as to make the transformed data conform to a factorized distribution, i.e., resulting in independent latent variables. we parametrize this transformation so that computing the jacobian determinant and inverse transform is trivial, yet we maintain the ability to learn complex non-linear transformations, via a composition of simple building blocks, each based on a deep neural network. the training criterion is simply the exact log-likelihood, which is tractable. unbiased ancestral sampling is also easy. we show that this approach yields good generative models on four image datasets and can be used for inpainting.",4 "automatic spatially-aware fashion concept discovery. this paper proposes an automatic spatially-aware concept discovery approach using weakly labeled image-text data from shopping websites. we first fine-tune googlenet by jointly modeling clothing images and their corresponding descriptions in a visual-semantic embedding space. then, for each attribute (word), we generate its spatially-aware representation by combining its semantic word vector representation with its spatial representation derived from the convolutional maps of the fine-tuned network. the resulting spatially-aware representations are further used to cluster attributes into multiple groups to form spatially-aware concepts (e.g., the neckline concept might consist of attributes like v-neck, round-neck, etc). finally, we decompose the visual-semantic embedding space into multiple concept-specific subspaces, which facilitates structured browsing and attribute-feedback product retrieval by exploiting multimodal linguistic regularities. we conducted extensive experiments on our newly collected fashion200k dataset, and the results on clustering quality evaluation and the attribute-feedback product retrieval task demonstrate the effectiveness of our automatically discovered spatially-aware concepts.",4 "towards understanding neural networks in natural-image spaces. two major uncertainties, dataset bias and perturbation, prevail in state-of-the-art ai algorithms with deep neural networks. in this paper, we present an intuitive explanation of these issues as well as an interpretation of the performance of deep networks in a natural-image space.
the explanation consists of two parts: the philosophy of neural networks and a hypothetic model of natural-image spaces. following the explanation, we slightly improve the accuracy of a cifar-10 classifier by introducing an additional ""random-noise"" category into training. we hope this paper will stimulate discussion in the community regarding the topological and geometric properties of the natural-image spaces to which deep networks are applied.",4 "goedel machines: self-referential universal problem solvers making provably optimal self-improvements. we present the first class of mathematically rigorous, general, fully self-referential, self-improving, optimally efficient problem solvers. inspired by kurt goedel's celebrated self-referential formulas (1931), such a problem solver rewrites any part of its own code as soon as it has found a proof that the rewrite is useful, where the problem-dependent utility function and the hardware and the entire initial code are described by axioms encoded in an initial proof searcher which is also part of the initial code. the searcher systematically and efficiently tests computable proof techniques (programs whose outputs are proofs) until it finds a provably useful, computable self-rewrite. we show that such a self-rewrite is globally optimal - no local maxima! - since the code first had to prove that it is not useful to continue the proof search for alternative self-rewrites. unlike previous non-self-referential methods based on hardwired proof searchers, ours not only boasts an optimal order of complexity but can optimally reduce any slowdowns hidden by the o()-notation, provided the utility of such speed-ups is provable at all.",4 "cp-nets and nash equilibria. we relate here two formalisms that are used for different purposes in reasoning about multi-agent systems. one of them are strategic games that are used to capture the idea that agents interact while pursuing their own interest. the other are cp-nets that were introduced to express qualitative and conditional preferences of the users and that aim at facilitating the process of preference elicitation. to relate these two formalisms we introduce a natural, qualitative, extension of the notion of a strategic game. we show that the optimal outcomes of a cp-net are exactly the nash equilibria of an appropriately defined strategic game in the above sense.
this allows us to use the techniques of game theory to search for optimal outcomes of cp-nets and, vice-versa, to use the techniques developed for cp-nets to search for nash equilibria of the considered games.",4 "bioinformatics and medicine in the era of deep learning. many of the current scientific advances in the life sciences have their origin in the intensive use of data for knowledge discovery. in no area is this so clear as in bioinformatics, led by technological breakthroughs in data acquisition technologies. it has been argued that bioinformatics could quickly become the field of research generating the largest data repositories, beating other data-intensive areas such as high-energy physics or astroinformatics. over the last decade, deep learning has become a disruptive advance in machine learning, giving new life to the long-standing connectionist paradigm in artificial intelligence. deep learning methods are ideally suited to large-scale data and, therefore, they are also ideally suited to knowledge discovery in bioinformatics and biomedicine at large. in this brief paper, we review key aspects of the application of deep learning in bioinformatics and medicine, drawing on the themes covered by the contributions to an esann 2018 special session devoted to this topic.",4 "contribution of case based reasoning (cbr) in the exploitation of return of experience. application to accident scenarii in railroad transport. our study is based on a base of accident scenarii in rail transport (feedback), firstly in order to develop a tool to share and sustain safety knowledge, and secondly to exploit the stored knowledge to prevent the reproduction of accidents / incidents. this tool will ultimately lead to a proposal of prevention and protection measures to minimize the risk level of a new transport system and thus improve its safety. our approach to achieving this goal largely depends on the use of artificial intelligence techniques, and more precisely the use of a method of automatic learning, in order to develop a feasibility model of a software tool based on case based reasoning (cbr) that exploits the stored knowledge in order to create know-how that helps and stimulates domain experts in the task of analysis, evaluation and certification of a new system.",4 "systematic derivation of behaviour characterisations in evolutionary robotics. evolutionary techniques driven by behavioural diversity, such as novelty search, have shown significant potential in evolutionary robotics.
these techniques rely on priorly specified behaviour characterisations to estimate the similarity between individuals. such characterisations are typically defined in an ad hoc manner based on the experimenter's intuition and knowledge about the task. alternatively, generic characterisations based on the sensor-effector values of the agents can be used. in this paper, we propose a novel approach that allows for the systematic derivation of behaviour characterisations in evolutionary robotics, based on a formal description of the agents and the environment. systematically derived behaviour characterisations (sdbcs) go beyond generic characterisations, as they can contain task-specific features related to the internal state of the agents, environmental features, and relations between them. we evaluate sdbcs with novelty search in three simulated collective robotics tasks. our results show that sdbcs yield a performance comparable to task-specific characterisations, in terms of both solution quality and behaviour space exploration.",4 "finding near-optimal independent sets at scale. the independent set problem is np-hard and particularly difficult to solve in large sparse graphs. in this work, we develop an advanced evolutionary algorithm, which incorporates kernelization techniques to compute large independent sets in huge sparse networks. a recent exact algorithm has shown that large networks can be solved exactly by employing a branch-and-reduce technique that recursively kernelizes the graph and performs branching. however, one major drawback of that algorithm is that, for huge graphs, branching can still take exponential time. to avoid this problem, we recursively choose vertices that are likely to be in a large independent set (using an evolutionary approach), then further kernelize the graph. we show that identifying and removing vertices likely to be in large independent sets opens up the reduction space---which speeds up the computation of large independent sets drastically, and also enables us to compute high-quality independent sets on much larger instances than previously reported in the literature.",4 "bayesian learning in undirected graphical models: approximate mcmc algorithms. in bayesian learning of undirected graphical models, computing posterior distributions over parameters and predictive quantities is exceptionally difficult.
conjecture general undirected models, tractable mcmc (markov chain monte carlo) schemes giving correct equilibrium distribution parameters. intractability, due partition function, familiar performing parameter optimisation, bayesian learning posterior distributions undirected model parameters unexplored poses novel challenges. propose several approximate mcmc schemes test fully observed binary models (boltzmann machines) small coronary heart disease data set larger artificial systems. approximations must perform well model, interaction sampling scheme also important. samplers based variational mean-field approximations generally performed poorly, advanced methods using loopy propagation, brief sampling stochastic dynamics lead acceptable parameter posteriors. finally, demonstrate techniques markov random field hidden variables.",4 "6d object pose estimation depth images: seamless approach robotic interaction augmented reality. determine 3d orientation 3d location objects surroundings camera mounted robot mobile device, developed two powerful algorithms object detection temporal tracking combined seamlessly robotic perception interaction well augmented reality (ar). separate evaluation of, respectively, object detection temporal tracker demonstrates important stride research well impact industrial robotic applications ar. evaluated standard dataset, detector produced highest f1-score large margin tracker generated best accuracy low latency approximately 2 ms per frame one cpu core: algorithms outperforming state art. combined, achieve powerful framework robust handle multiple instances object occlusion clutter attaining real-time performance.
aiming stepping beyond simple scenarios used current systems, often constrained single object absence clutter, averting touch object prevent close-range partial occlusion, selecting brightly colored objects easily segment individually assuming object simple geometric structure, demonstrate capacity handle challenging cases clutter, partial occlusion varying lighting conditions objects different shapes sizes.",4 "fluency adequacy: pilot study measuring user trust imperfect mt. although measuring intrinsic quality key factor advancement machine translation (mt), successfully deploying mt requires considering intrinsic quality also user experience, including aspects trust. work introduces method studying users modulate trust mt system seeing errorful (disfluent inadequate) output amidst good (fluent adequate) output. conduct survey determine users respond good translations compared translations either adequate fluent, fluent adequate. pilot study, users responded strongly disfluent translations, were, surprisingly, much less concerned adequacy.",4 "latent skill embedding personalized lesson sequence recommendation. students online courses generate large amounts data used personalize learning process improve quality education. paper, present latent skill embedding (lse), probabilistic model students educational content used recommend personalized sequences lessons goal helping students prepare specific assessments. akin collaborative filtering recommender systems, algorithm require students content described features, learns representation using access traces. formulate problem regularized maximum-likelihood embedding students, lessons, assessments historical student-content interactions. empirical evaluation large-scale data knewton, adaptive learning technology company, shows approach predicts assessment results competitively benchmark models able discriminate lesson sequences lead mastery failure.",4 "parallel graph partitioning complex networks. 
processing large complex networks like social networks web graphs recently attracted considerable interest. order parallel, need partition pieces equal size. unfortunately, previous parallel graph partitioners originally developed regular mesh-like networks work well networks. paper addresses problem parallelizing adapting label propagation technique originally developed graph clustering. introducing size constraints, label propagation becomes applicable coarsening refinement phase multilevel graph partitioning. obtain high quality applying highly parallel evolutionary algorithm coarsened graph. resulting system scalable achieves higher quality state-of-the-art systems like parmetis pt-scotch. large complex networks performance differences big. example, algorithm partition web graph 3.3 billion edges less sixteen seconds using 512 cores high performance cluster producing high quality partition -- none competing systems handle graph system.",4 "manifold regularized kernel logistic regression web image annotation. rapid advance internet technology smart devices, users often need manage large amounts multimedia information using smart devices, personal image video accessing browsing. requirements heavily rely success image (video) annotation, thus large scale image annotation innovative machine learning methods attracted intensive attention recent years. one representative work support vector machine (svm). although works well binary classification, svm non-smooth loss function naturally cover multi-class case. paper, propose manifold regularized kernel logistic regression (klr) web image annotation. compared svm, klr following advantages: (1) klr smooth loss function; (2) klr produces explicit estimate probability instead class label; (3) klr naturally generalized multi-class case. 
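The advantages of kernel logistic regression listed above (a smooth loss, explicit class probabilities, and a natural multi-class extension) can be illustrated with a minimal sketch. This is not the paper's manifold-regularized method; it is a plain binary KLR with an RBF kernel and a standard RKHS-norm penalty, trained by gradient descent on hypothetical toy data:

```python
import numpy as np

# Minimal binary kernel logistic regression (KLR) sketch with an RBF kernel.
# Data, parameter values, and function names are illustrative assumptions.

def rbf_kernel(A, B, gamma=1.0):
    d = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d)

def fit_klr(X, y, lam=0.1, lr=0.1, steps=500):
    """Learn dual coefficients alpha for P(y=1|x) = sigmoid(K @ alpha)."""
    K = rbf_kernel(X, X)
    alpha = np.zeros(len(X))
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(K @ alpha)))   # smooth, probabilistic output
        grad = K @ (p - y) + lam * (K @ alpha)   # log-loss + RKHS-norm penalty
        alpha -= lr * grad / len(X)
    return alpha, K

X = np.array([[0.0], [0.2], [1.0], [1.2]])
y = np.array([0, 0, 1, 1])
alpha, K = fit_klr(X, y)
probs = 1.0 / (1.0 + np.exp(-(K @ alpha)))       # probabilities, not just labels
```

Because the loss is smooth, plain gradient descent applies, and replacing the sigmoid with a softmax gives the multi-class case directly.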
carefully conduct experiments mir flickr dataset demonstrate effectiveness manifold regularized kernel logistic regression image annotation.",4 "deep predictive coding networks video prediction unsupervised learning. great strides made using deep learning algorithms solve supervised learning tasks, problem unsupervised learning - leveraging unlabeled examples learn structure domain - remains difficult unsolved challenge. here, explore prediction future frames video sequence unsupervised learning rule learning structure visual world. describe predictive neural network (""prednet"") architecture inspired concept ""predictive coding"" neuroscience literature. networks learn predict future frames video sequence, layer network making local predictions forwarding deviations predictions subsequent network layers. show networks able robustly learn predict movement synthetic (rendered) objects, so, networks learn internal representations useful decoding latent object parameters (e.g. pose) support object recognition fewer training views. also show networks scale complex natural image streams (car-mounted camera videos), capturing key aspects egocentric movement movement objects visual scene, representation learned setting useful estimating steering angle. altogether, results suggest prediction represents powerful framework unsupervised learning, allowing implicit learning object scene structure.",4 "alternative rdf-based languages representation processing ontologies semantic web. paper describes approach representation processing ontologies semantic web, based icmaus theory computation ai. approach strengths complement languages based resource description framework (rdf) rdf schema daml+oil. main benefits icmaus approach simplicity comprehensibility representation ontologies, ability cope errors uncertainties knowledge, versatile reasoning system capabilities kinds probabilistic reasoning seem required semantic web.",4 "logical difference lightweight description logic el. 
study logic-based approach versioning ontologies. view, ontologies provide answers queries vocabulary interest. difference two versions ontology given set queries receive different answers. investigate approach terminologies given description logic el extended role inclusions domain range restrictions three distinct types queries: subsumption, instance, conjunctive queries. three cases, present polynomial-time algorithms decide whether two terminologies give answers queries given vocabulary compute succinct representation difference non- empty. present implementation, cex2, developed algorithms subsumption instance queries apply distinct versions snomed ct nci ontology.",4 "parkinson's disease patient rehabilitation using gaming platforms: lessons learnt. parkinson's disease (pd) progressive neurodegenerative movement disorder motor dysfunction gradually increases disease progress. addition administering dopaminergic pd-specific drugs, attending neurologists strongly recommend regular exercise combined physiotherapy. however, long-term nature disease, patients following traditional rehabilitation programs may get bored, lose interest eventually drop direct result repeatability predictability prescribed exercises. technology supported opportunities liven daily exercise schedule appeared form character-based, virtual reality games promote physical training non-linear looser fashion provide experience varies one game loop next. ""exergames"", word results amalgamation words ""exercise"" ""game"" challenge patients performing movements varying complexity playful immersive virtual environment. today's game consoles nintendo's wii, sony playstation eye microsoft's kinect sensor present new opportunities infuse motivation variety otherwise mundane physiotherapy routine. 
paper present approaches, discuss suitability pd patients, mainly basis demands made balance, agility gesture precision, present design principles exergame platforms must comply order suitable pd patients.",4 "evolutionary optimization experimental apparatus. recent decades, cold atom experiments become increasingly complex. computers control parameters, optimization mostly done manually. time-consuming task high-dimensional parameter space unknown correlations. automate process using genetic algorithm based differential evolution. demonstrate algorithm optimizes 21 correlated parameters robust local maxima experimental noise. algorithm flexible easy implement. thus, presented scheme applied wide range experimental optimization tasks.",18 "connection bayesian estimation gaussian random field rkhs. reconstruction function noisy data often formulated regularized optimization problem infinite-dimensional reproducing kernel hilbert space (rkhs). solution describes observed data small rkhs norm. data fit measured using quadratic loss, estimator known statistical interpretation. given noisy measurements, rkhs estimate represents posterior mean (minimum variance estimate) gaussian random field covariance proportional kernel associated rkhs. paper, provide statistical interpretation general losses used, absolute value, vapnik huber. specifically, finite set sampling locations (including data collected), map estimate signal samples given rkhs estimate evaluated locations.",19 "incorporating feedback tree-based anomaly detection. anomaly detectors often used produce ranked list statistical anomalies, examined human analysts order extract actual anomalies interest. unfortunately, real-world applications, process exceedingly difficult analyst since large fraction high-ranking anomalies false positives interesting application perspective. paper, aim make analyst's job easier allowing analyst feedback investigation process.
ideally, feedback influences ranking anomaly detector way reduces number false positives must examined discovering anomalies interest. particular, introduce novel technique incorporating simple binary feedback tree-based anomaly detectors. focus isolation forest algorithm representative tree-based anomaly detector, show significantly improve performance incorporating feedback, compared baseline algorithm incorporate feedback. technique simple scales well size data increases, makes suitable interactive discovery anomalies large datasets.",4 "3d convolutional neural networks cross audio-visual matching recognition. audio-visual recognition (avr) considered solution speech recognition tasks audio corrupted, well visual recognition method used speaker verification multi-speaker scenarios. approach avr systems leverage extracted information one modality improve recognition ability modality complementing missing information. essential problem find correspondence audio visual streams, goal work. propose use coupled 3d convolutional neural network (3d-cnn) architecture map modalities representation space evaluate correspondence audio-visual streams using learned multimodal features. proposed architecture incorporate spatial temporal information jointly effectively find correlation temporal information different modalities. using relatively small network architecture much smaller dataset training, proposed method surpasses performance existing similar methods audio-visual matching use 3d cnns feature representation. also demonstrate effective pair selection method significantly increase performance. proposed method achieves relative improvements 20% equal error rate (eer) 7% average precision (ap) comparison state-of-the-art method.",4 "learning probabilistic systems tree samples. consider problem learning non-deterministic probabilistic system consistent given finite set positive negative tree samples. consistency defined respect strong simulation conformance. 
propose learning algorithms use traditional new ""stochastic"" state-space partitioning, latter resulting minimum number states. use solve problem ""active learning"", uses knowledgeable teacher generate samples counterexamples simulation equivalence queries. show problem undecidable general, becomes decidable suitable condition teacher comes naturally way samples generated failed simulation checks. latter problem shown undecidable impose additional condition learner always conjecture ""minimum state"" hypothesis. therefore propose semi-algorithm using stochastic partitions. finally, apply proposed (semi-) algorithms infer intermediate assumptions automated assume-guarantee verification framework probabilistic systems.",4 "parameterized complexity kernel bounds hard planning problems. propositional planning problem notoriously difficult computational problem. downey et al. (1999) initiated parameterized analysis planning (with plan length parameter) bäckström et al. (2012) picked line research provided extensive parameterized analysis various restrictions, leaving open one stubborn case. continue work provide full classification. particular, show case actions preconditions $e$ postconditions fixed-parameter tractable $e\leq 2$ w[1]-complete otherwise. show fixed-parameter tractability reduction variant steiner tree problem; problem shown fixed-parameter tractable guo et al. (2007). problem fixed-parameter tractable, admits polynomial-time self-reduction instances whose input size bounded function parameter, called kernel. problems, function even polynomial desirable computational implications. recent research parameterized complexity focused classifying fixed-parameter tractable problems whether admit polynomial kernels not. revisit previously obtained restrictions planning fixed-parameter tractable show none admits polynomial kernel unless polynomial hierarchy collapses third level.",4 "use neural networks analysis sleep stages diagnosis narcolepsy.
used neural networks ~3,000 sleep recordings 10 locations automate sleep stage scoring, producing probability distribution called hypnodensity graph. accuracy validated 70 subjects scored six technicians (gold standard). best model performed better individual scorer, reaching accuracy 0.87 (and 0.95 predictions weighted scorer agreement). also scores sleep stages 5-second instead conventional 30-second scoring-epochs. accuracy vary sleep disorder except narcolepsy, suggesting scoring difficulties machine and/or humans. narcolepsy biomarker extracted validated 105 type-1 narcoleptics versus 331 controls producing specificity 0.96 sensitivity 0.91. similar performances obtained high pretest probability sample type-2 narcolepsy idiopathic hypersomnia patients. addition hla-dqb1*06:02 increased specificity 0.99. method streamlines scoring diagnoses narcolepsy accurately.",4 "validation soft classification models using partial class memberships: extended concept sensitivity & co. applied grading astrocytoma tissues. use partial class memberships soft classification model uncertain labelling mixtures classes. partial class memberships restricted predictions, may also occur reference labels (ground truth, gold standard diagnosis) training validation data. classifier performance usually expressed fractions confusion matrix, sensitivity, specificity, negative positive predictive values. extend concept soft classification discuss bias variance properties extended performance measures. ambiguity reference labels translates differences best-case, expected worst-case performance. show second set measures comparing expected ideal performance closely related regression performance, namely root mean squared error rmse mean absolute error mae. calculations apply classical crisp classification well soft classification (partial class memberships and/or one-class classifiers). proposed performance measures allow test classifiers actual borderline cases. addition, hardening e.g.
posterior probabilities class labels necessary, avoiding corresponding information loss increase variance. implement proposed performance measures r package ""softclassval"", available cran http://softclassval.r-forge.r-project.org. reasoning well importance partial memberships chemometric classification illustrated real-world application: astrocytoma brain tumor tissue grading (80 patients, 37000 spectra) finding surgical excision borders. borderline cases actual target analytical technique, samples diagnosed borderline cases must included validation.",19 "goal-driven query answering existential rules equality. inspired magic sets datalog, present novel goal-driven approach answering queries terminating existential rules equality (aka tgds egds). technique improves performance query answering pruning consequences relevant query. challenging setting equalities potentially affect predicates dataset. address problem combining existing singularization technique two new ingredients: algorithm identifying rules relevant query new magic sets algorithm. show empirically technique significantly improve performance query answering, mean difference answering query seconds able process query all.",4 "beyond word frequency: bursts, lulls, scaling temporal distributions words. background: zipf's discovery word frequency distributions obey power law established parallels biological physical processes, language, laying groundwork complex systems perspective human communication. recent research also identified scaling regularities dynamics underlying successive occurrences events, suggesting possibility similar findings language well. methodology/principal findings: considering frequent words usenet discussion groups disparate databases language different levels formality, show distributions distances successive occurrences word display bursty deviations poisson process well characterized stretched exponential (weibull) scaling.
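The measurement behind this finding (distances between successive occurrences of a word, and their bursty deviation from a Poisson process) can be sketched in a few lines. The token stream and the burstiness statistic (coefficient of variation of the gaps, where CV > 1 indicates burstiness) are illustrative assumptions, not the paper's corpora or fitting procedure:

```python
# Sketch: recurrence distances of a word in a token stream, plus a simple
# burstiness statistic. A CV well above 1 signals bursty, non-Poisson gaps.

def recurrence_gaps(tokens, word):
    """Distances between successive occurrences of `word`."""
    positions = [i for i, t in enumerate(tokens) if t == word]
    return [b - a for a, b in zip(positions, positions[1:])]

def coefficient_of_variation(gaps):
    n = len(gaps)
    mean = sum(gaps) / n
    var = sum((g - mean) ** 2 for g in gaps) / n
    return (var ** 0.5) / mean

# toy stream: "the" occurs in two tight bursts separated by a long lull
tokens = ["the", "x"] * 5 + ["y"] * 40 + ["the", "z"] * 5
gaps = recurrence_gaps(tokens, "the")
cv = coefficient_of_variation(gaps)  # > 1: bursty
```

Fitting a stretched-exponential (Weibull) distribution to such gaps, as the abstract describes, would be the next step on real corpus data.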
extent deviation depends strongly semantic type -- measure logicality word -- less strongly frequency. develop generative model behavior fully determines dynamics word usage. conclusions/significance: recurrence patterns words well described stretched exponential distribution recurrence times, empirical scaling cannot anticipated zipf's law. use words provides uniquely precise powerful lens human thought activity, findings also implications overt manifestations collective human dynamics.",4 "adaptive optimal online linear regression l1-balls. consider problem online linear regression individual sequences. goal paper forecaster output sequential predictions are, time rounds, almost good ones output best linear predictor given l1-ball r^d. consider cases dimension small large relative time horizon t. first present regret bounds optimal dependencies sizes u, x l1-ball, input data observations. minimax regret shown exhibit regime transition around point d = sqrt(t) u x / (2 y). furthermore, present efficient algorithms adaptive, i.e., require knowledge u, x, y, still achieve nearly optimal regret bounds.",19 "perceptual adversarial networks image-to-image transformation. paper, propose principled perceptual adversarial networks (pan) image-to-image transformation tasks. unlike existing application-specific algorithms, pan provides generic framework learning mapping relationship paired images (fig. 1), mapping rainy image de-rained counterpart, object edges photo, semantic labels scenes image, etc. proposed pan consists two feed-forward convolutional neural networks (cnns), image transformation network discriminative network d. combining generative adversarial loss proposed perceptual adversarial loss, two networks trained alternately solve image-to-image transformation tasks. among them, hidden layers output discriminative network upgraded continually automatically discover discrepancy transformed image corresponding ground-truth.
simultaneously, image transformation network trained minimize discrepancy explored discriminative network d. adversarial training process, image transformation network continually narrow gap transformed images ground-truth images. experiments evaluated several image-to-image transformation tasks (e.g., image de-raining, image inpainting, etc.) show proposed pan outperforms many related state-of-the-art methods.",4 "lifting deep: convolutional 3d pose estimation single image. propose unified formulation problem 3d human pose estimation single raw rgb image reasons jointly 2d joint estimation 3d pose reconstruction improve tasks. take integrated approach fuses probabilistic knowledge 3d human pose multi-stage cnn architecture uses knowledge plausible 3d landmark locations refine search better 2d locations. entire process trained end-to-end, extremely efficient obtains state-of-the-art results human3.6m outperforming previous approaches 2d 3d errors.",4 "counterfactual language model adaptation suggesting phrases. mobile devices use language models suggest words phrases use text entry. traditional language models based contextual word frequency static corpus text. however, certain types phrases, offered writers suggestions, may systematically chosen often frequency would predict. paper, propose task generating suggestions writers accept, related distinct task making accurate predictions. although task fundamentally interactive, propose counterfactual setting permits offline training evaluation. find even simple language model capture text characteristics improve acceptability.",4 "probabilistic estimation prediction technique dynamic continuous social science models: evolution attitude basque country population towards eta case study. paper, present computational technique deal uncertainty dynamic continuous models social sciences.
considering data surveys, method consists determining probability distribution survey output allows sample data fit model sampled data using goodness-of-fit criterion based chi-square-test. taking fitted parameters non-rejected chi-square-test, substituting model computing outputs, build 95% confidence intervals time instant capturing uncertainty survey data (probabilistic estimation). using set obtained model parameters, also provide prediction next years 95% confidence intervals (probabilistic prediction). technique applied dynamic social model describing evolution attitude basque country population towards revolutionary organization eta.",4 "stochastic backpropagation approximate inference deep generative models. marry ideas deep neural networks approximate bayesian inference derive generalised class deep, directed generative models, endowed new algorithm scalable inference learning. algorithm introduces recognition model represent approximate posterior distributions, acts stochastic encoder data. develop stochastic back-propagation -- rules back-propagation stochastic variables -- use develop algorithm allows joint optimisation parameters generative recognition model. demonstrate several real-world data sets model generates realistic samples, provides accurate imputations missing data useful tool high-dimensional data visualisation.",19 "generalizable data-free objective crafting universal adversarial perturbations. machine learning models susceptible adversarial perturbations: small changes input cause large changes output. also demonstrated exist input-agnostic perturbations, called universal adversarial perturbations, change inference target model data samples. however, existing methods craft universal perturbations (i) task specific, (ii) require samples training data distribution, (iii) perform complex optimizations. also, data dependence, fooling ability crafted perturbations proportional available training data. 
paper, present novel, generalizable data-free objective crafting universal adversarial perturbations. independent underlying task, objective achieves fooling via corrupting extracted features multiple layers. therefore, proposed objective generalizable craft image-agnostic perturbations across multiple vision tasks object recognition, semantic segmentation depth estimation. practical setting black-box attacking scenario, show objective outperforms data dependent objectives fool learned models. further, via exploiting simple priors related data distribution, objective remarkably boosts fooling ability crafted perturbations. significant fooling rates achieved objective emphasize current deep learning models increased risk, since objective generalizes across multiple tasks without requirement training data crafting perturbations.",4 "investigating working text classifiers. text classification one widely studied task natural language processing. recently, larger larger multilayer neural network models employed task motivated principle compositionality. almost methods reported use discriminative approaches task. discriminative approaches come caveat proper capacity control, might latch signal even though might generalize. use various state-of-the-art approaches text classifiers, want explore models actually learn compose meaning sentences still use key lexicons. test hypothesis, construct datasets train test split direct overlap lexicons. study various text classifiers observe big performance drop datasets. finally, show even simple regularization techniques improve performance datasets.",4 "signal recovery graphs: variation minimization. consider problem signal recovery graphs graphs model data complex structure signals graph. graph signal recovery implies recovery one multiple smooth graph signals noisy, corrupted, incomplete measurements. propose graph signal model formulate signal recovery corresponding optimization problem. 
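One common instance of such a graph signal recovery formulation can be sketched concretely. This is an illustrative assumption, not the paper's model: denoising a signal y on a graph by minimizing ||x - y||^2 + alpha * x^T L x, which has the closed-form solution x = (I + alpha L)^{-1} y:

```python
import numpy as np

# Sketch: smoothness-regularized recovery of a graph signal from noisy
# measurements. Graph, signal, and alpha are hypothetical toy choices.

def laplacian(adj):
    """Unnormalized graph Laplacian L = D - A."""
    return np.diag(adj.sum(axis=1)) - adj

def recover(adj, y, alpha=1.0):
    """Solve min_x ||x - y||^2 + alpha * x^T L x  =>  x = (I + alpha L)^{-1} y."""
    L = laplacian(adj)
    return np.linalg.solve(np.eye(len(y)) + alpha * L, y)

# 4-node path graph with a noisy, roughly constant signal
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
y = np.array([1.0, 1.2, 0.9, 1.1])
x = recover(adj, y, alpha=2.0)
```

The recovered x has smaller graph variation x^T L x than y while preserving the mean (L annihilates constant signals), which is the basic mechanism the more general ADMM-based solutions build on.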
provide general solution using alternating direction methods multipliers. next show signal inpainting, matrix completion, robust principal component analysis, anomaly detection relate graph signal recovery, provide corresponding specific solutions theoretical analysis. finally, validate proposed methods real-world recovery problems, including online blog classification, bridge condition identification, temperature estimation, recommender system, expert opinion combination online blog classification.",4 "pattern recognition theory mind. propose pattern recognition, memorization processing key concepts principle set theoretical modeling mind function. questions mind functioning answered descriptive modeling definitions principles. understandable consciousness definition drawn based assumption pattern recognition system recognize patterns activity. principles, descriptive modeling definitions basis theoretical applied research cognitive sciences, particularly artificial intelligence studies.",4 "sufficiently fast algorithm finding close optimal junction trees. algorithm developed finding close optimal junction tree given graph g. algorithm worst case complexity o(c^k n^a) c, a constants, n number vertices, k size largest clique junction tree g size minimized. algorithm guarantees logarithm size state space heaviest clique junction tree produced less constant factor optimal value. k = o(log n), algorithm yields polynomial inference algorithm bayesian networks.",4 "fuzzy object-oriented dynamic networks. ii. article generalizes object-oriented dynamic networks fuzzy case, allows one represent knowledge objects classes objects fuzzy nature also model changes time. within framework approach described, mechanism proposed makes possible acquire new knowledge basis basic knowledge considerably differs well-known methods used existing models knowledge representation.
approach illustrated example construction concrete fuzzy object-oriented dynamic network.",4 "investigating evolvability web page load time. client-side javascript execution environments (browsers) allow anonymous functions event-based programming concepts callbacks. investigate whether mutate-and-test approach used optimise web page load time environments. first, characterise web page load issue benchmark web page derive performance metrics page load event traces. parse javascript source code ast make changes method calls appear web page load event trace. present operator based solely code deletion evaluate existing ""community-contributed"" performance optimising code transform. exploring javascript code changes exploiting combinations non-destructive changes, optimise page load time 41% benchmark web page.",4 "evaluating informal-domain word representations urbandictionary. existing corpora intrinsic evaluation targeted towards tasks informal domains twitter news comment forums. want test whether representation informal words fulfills promise eliding explicit text normalization preprocessing step. one possible evaluation metric domains proximity spelling variants. propose metric might computed spelling variant dataset collected using urbandictionary.",4 "automatic segmentation retinal vasculature. segmentation retinal vessels retinal fundus images key step automatic retinal image analysis. paper, propose new unsupervised automatic method segment retinal vessels retinal fundus images. contrast enhancement illumination correction carried series image processing steps followed adaptive histogram equalization anisotropic diffusion filtering. image converted gray scale using weighted scaling. vessel edges enhanced boosting detail curvelet coefficients. optic disk pixels removed applying fuzzy c-mean classification avoid misclassification. morphological operations connected component analysis applied obtain segmented retinal vessels.
performance proposed method evaluated using drive database able compare state-of-the-art supervised unsupervised methods. overall segmentation accuracy proposed method 95.18% outperforms algorithms.",4 "joint training deep boltzmann machines. introduce new method training deep boltzmann machines jointly. prior methods require initial learning pass trains deep boltzmann machine greedily, one layer time, perform well classification tasks.",19 "generating explanations biomedical queries. introduce novel mathematical models algorithms generate (shortest k different) explanations biomedical queries, using answer set programming. implement algorithms integrate bioquery-asp. illustrate usefulness methods complex biomedical queries related drug discovery, biomedical knowledge resources pharmgkb, drugbank, biogrid, ctd, sider, disease ontology orphadata. appear theory practice logic programming (tplp).",4 "iterative bayesian learning crowdsourced regression. crowdsourcing platforms emerged popular venues purchasing human intelligence low cost large volumes tasks. many low-paid workers prone give noisy answers, one fundamental questions identify reliable workers exploit heterogeneity infer true answers accurately. despite significant research efforts classification tasks discrete answers, little attention paid regression tasks continuous answers. popular dawid-skene model discrete answers algorithmic mathematical simplicity relation low-rank structures. generalize continuous valued answers. end, introduce new probabilistic model crowdsourced regression capturing heterogeneity workers, generalizing dawid-skene model continuous domain. design message-passing algorithm bayesian inference inspired popular belief propagation algorithm. showcase performance first proving achieves near optimal mean squared error comparing oracle estimator. asymptotically, provide tighter analysis showing proposed algorithm achieves exact optimal performance.
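The worker-heterogeneity idea behind such crowdsourced regression can be sketched with a deliberately simplified estimator (not the paper's message-passing algorithm): alternately estimate task answers as variance-weighted means and re-estimate each worker's noise variance from residuals:

```python
import numpy as np

# Illustrative simplification: iterative variance-weighted aggregation of
# continuous crowdsourced answers. Data and parameters are hypothetical.

def iterative_weighted_estimate(answers, n_iters=20, eps=1e-6):
    """answers: (n_workers, n_tasks) matrix of continuous responses."""
    n_workers, n_tasks = answers.shape
    var = np.ones(n_workers)                       # start: all workers equal
    for _ in range(n_iters):
        w = 1.0 / (var + eps)                      # reliable workers weigh more
        est = (w[:, None] * answers).sum(axis=0) / w.sum()
        resid = answers - est[None, :]
        var = (resid ** 2).mean(axis=1)            # re-estimate worker noise
    return est, var

truth = np.array([1.0, 2.0, 3.0])
rng = np.random.default_rng(0)
good = truth + 0.01 * rng.standard_normal(3)       # two accurate workers
bad = truth + 2.0 * rng.standard_normal(3)         # one very noisy worker
answers = np.vstack([good, good + 0.01, bad])
est, var = iterative_weighted_estimate(answers)
```

The estimated per-worker variances separate reliable from noisy workers, and the aggregated answers move toward the reliable workers' responses.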
we next show synthetic experiments confirming our theoretical predictions. as a practical application, we emulate a crowdsourcing system by reproducing the pascal visual object classes datasets and show that de-noising the crowdsourced data with the proposed scheme can significantly improve the performance for the vision task.",4 "image compression by svd: a new quality metric based on the energy ratio. digital image compression is a technique that allows us to reduce the size of an image in order to increase the capacity of storage devices and to optimize the use of network bandwidth. the quality of images compressed by techniques based on the discrete cosine transform or the wavelet transform is generally measured with psnr or ssim. these metrics are not suitable for images compressed by singular value decomposition. this paper presents a new metric based on the energy ratio to measure the quality of images coded by svd. a series of tests on 512 * 512 pixel images shows that, for a rank k = 40 corresponding to ssim = 0.94 and psnr = 35 db, 99.9% of the energy is restored. three areas of image quality assessment are identified. the new metric is also accurate and could overcome the weaknesses of psnr and ssim.",4 "calibration of depth cameras using denoised depth images. depth sensing devices have created various new applications in scientific and commercial research with the advent of the microsoft kinect and pmd (photon mixing device) cameras. most of these applications require the depth cameras to be pre-calibrated. however, traditional calibration methods using a checkerboard do not work well for depth cameras due to their low image resolution. in this paper, we propose a depth calibration scheme which excels in estimating camera calibration parameters when only a handful of corners and calibration images are available. we exploit the noise properties of pmd devices to denoise depth measurements and perform camera calibration using the denoised depth as an additional set of measurements. our synthetic and real experiments show that our depth denoising and depth based calibration scheme provides significantly better results than traditional calibration methods.",4 "interpretable vaes for nonlinear group factor analysis. deep generative models have recently yielded encouraging results in producing subjectively realistic samples of complex data. far less attention has been paid to making these generative models interpretable.
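the svd energy ratio described in the image-compression abstract above (energy retained by a rank-k approximation over total energy) can be sketched as:

```python
import numpy as np

def svd_energy_ratio(image, k):
    """Energy retained by a rank-k SVD approximation: sum of the k largest
    squared singular values over the sum of all squared singular values."""
    s = np.linalg.svd(image, compute_uv=False)
    return float((s[:k] ** 2).sum() / (s ** 2).sum())

# toy usage: a rank-1 matrix retains all of its energy at k = 1
img = np.outer(np.arange(1.0, 9.0), np.arange(1.0, 9.0))
ratio = svd_energy_ratio(img, 1)
```

this is only the metric's definition as stated; the paper's full evaluation protocol (512 * 512 images, comparison with psnr/ssim) is not reproduced.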
in many scenarios, ranging from scientific applications to finance, the observed variables have a natural grouping. it is often of interest to understand systems of interaction amongst these groups, and latent factor models (lfms) are an attractive approach. however, traditional lfms are limited by assuming a linear correlation structure. we present an output interpretable vae (oi-vae) for grouped data that models complex, nonlinear latent-to-observed relationships. we combine a structured vae comprised of group-specific generators with a sparsity-inducing prior. we demonstrate that oi-vae yields meaningful notions of interpretability in the analysis of motion capture and meg data. we further show that in these situations the regularization inherent to oi-vae can actually lead to improved generalization of the learned generative processes.",4 "material recognition from local appearance in global context. recognition of materials has proven to be a challenging problem due to the wide variation in appearance within and between categories. global image context, such as the object that the material makes up, can be crucial to recognizing the material. existing methods, however, operate on an implicit fusion of materials and context by using large receptive fields as input (i.e., large image patches). many recent material recognition methods treat materials as yet another set of labels like objects. materials are, however, fundamentally different from objects as they have no inherent shape or defined spatial extent. approaches that ignore this can only take advantage of the limited implicit context as it appears during training. we instead show that recognizing materials purely from their local appearance and separately integrating recognized global contextual cues, including objects and places, leads to superior dense, per-pixel, material recognition. we achieve this by training a fully-convolutional material recognition network end-to-end with only material category supervision. we integrate object and place estimates into this network from independent cnns. this approach avoids the necessity of preparing an impractically-large amount of training data to cover the product space of materials, objects, and scenes, while fully leveraging contextual cues for dense material recognition. furthermore, we perform a detailed analysis of the effects of context granularity, spatial resolution, and the network level at which we introduce context.
on a recently introduced comprehensive and diverse material database \cite{schwartz2016}, we confirm that our method achieves state-of-the-art accuracy with significantly less training data compared with past methods.",4 "spike-and-slab sparse coding for unsupervised feature discovery. we consider the problem of using a factor model that we call {\em spike-and-slab sparse coding} (s3c) to learn features for a classification task. the s3c model resembles both the spike-and-slab rbm and sparse coding. since exact inference in this model is intractable, we derive a structured variational inference procedure and employ a variational em training algorithm. prior work on approximate inference for this model has not prioritized the ability to exploit parallel architectures and scale to enormous problem sizes. we present an inference procedure appropriate for use with gpus which allows us to dramatically increase both the training set size and the amount of latent factors. we demonstrate that this approach improves upon the supervised learning capabilities of both sparse coding and the ssrbm on the cifar-10 dataset. we then evaluate our approach's potential for semi-supervised learning on subsets of cifar-10. we demonstrate state-of-the-art self-taught learning performance on the stl-10 dataset and use our method to win the nips 2011 workshop on challenges in learning hierarchical models' transfer learning challenge.",19 "featureless 2d-3d pose estimation by minimising an illumination-invariant loss. the problem of identifying the 3d pose of a known object from a given 2d image has important applications in computer vision, ranging from robotic vision to image analysis. our proposed method of registering a 3d model of a known object to a given 2d photo of the object has numerous advantages over existing methods: it neither requires prior training or learning, nor knowledge of the camera parameters, nor explicit point correspondences or matching features between the image and the model. unlike techniques that estimate a partial 3d pose (as in an overhead view of traffic or of machine parts on a conveyor belt), our method estimates the complete 3d pose of the object, works on a single static image from a given view, and under varying and unknown lighting conditions. for this purpose we derive a novel illumination-invariant distance measure between the 2d photo and the projected 3d model, which is then minimised to find the best pose parameters.
results on vehicle pose detection are presented.",4 "a probabilistic approach for learning folksonomies from structured data. learning structured representations has emerged as an important problem in many domains, including document and web data mining, bioinformatics, and image analysis. one approach to learning complex structures is to integrate many smaller, incomplete and noisy structure fragments. in this work, we present an unsupervised probabilistic approach that extends affinity propagation to combine small ontological fragments into a collection of integrated, consistent, and larger folksonomies. this is a challenging task because the method must aggregate similar structures while avoiding structural inconsistencies and handling noise. we validate the approach on a real-world social media dataset, comprised of shallow personal hierarchies specified by many individual users, collected from the photosharing website flickr. our empirical results show that the proposed approach is able to construct deeper and denser structures, compared to an approach using the standard affinity propagation algorithm. additionally, the approach yields better overall integration quality than a state-of-the-art approach based on incremental relational clustering.",4 "reasoning with uncertain knowledge. a model of knowledge representation is described in which propositional facts and relationships among them are supported by other facts. each item in the set of knowledge supported in this way, called the set of cognitive units, is associated with descriptions of explicit and implicit support structures, summarizing both belief and the reliability of that belief. the summary is precise enough to be useful in a computational model while remaining descriptive of the underlying symbolic support structure. when a fact supports another supportive relationship between facts, we call this meta-support. this facilitates reasoning about propositional knowledge and the support structures underlying it.",4 "an efficient algorithm for extremely large multi-task regression with massive structured sparsity. we develop a highly scalable optimization method called ""hierarchical group-thresholding"" for solving a multi-task regression model with complex structured sparsity constraints on both the input and output spaces.
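the basic group-thresholding operation named above can be illustrated with the standard group-lasso proximal step, which zeroes out whole groups of coefficients at once; the paper's hierarchical tree screening over nested groups is not reproduced in this sketch:

```python
import numpy as np

def group_threshold(beta, groups, lam):
    """Proximal step of the group-lasso penalty: a group whose l2 norm is at
    most lam is set to zero entirely; surviving groups are shrunk toward 0."""
    out = beta.astype(float).copy()
    for idx in groups:
        g = out[idx]
        norm = np.linalg.norm(g)
        out[idx] = 0.0 if norm <= lam else g * (1.0 - lam / norm)
    return out

beta = np.array([3.0, 4.0, 0.1, 0.1])
res = group_threshold(beta, [[0, 1], [2, 3]], lam=1.0)
```

here the second group has norm below lam and is removed as a whole, which is the mechanism that lets group screening skip irrelevant coefficients.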
despite the recent emergence of several efficient optimization algorithms for tackling complex sparsity-inducing regularizers, true scalability in practical high-dimensional problems, where a huge amount (e.g., millions) of sparsity patterns need to be enforced, remains an open challenge, because all existing algorithms must deal with all the patterns exhaustively in every iteration, which is computationally prohibitive. our proposed algorithm addresses the scalability problem by screening out multiple groups of coefficients simultaneously and systematically. we employ a hierarchical tree representation of the group constraints to accelerate the process of removing irrelevant constraints by taking advantage of the inclusion relationships between group sparsities, thereby avoiding dealing with all the constraints in every optimization step and necessitating optimization operations only on a small number of outstanding coefficients. in our experiments, we demonstrate the efficiency of our method on simulation datasets, and in an application of detecting genetic variants associated with gene expression traits.",19 "diving deep into sentiment: understanding fine-tuned cnns for visual sentiment prediction. visual media are a powerful means of expressing emotions and sentiments. the constant generation of new content in social networks highlights the need for automated visual sentiment analysis tools. while convolutional neural networks (cnns) have established a new state-of-the-art in several vision problems, their application to the task of sentiment analysis is mostly unexplored, and there are few studies regarding how to design cnns for this purpose. in this work, we study the suitability of fine-tuning a cnn for visual sentiment prediction as well as explore performance boosting techniques within this deep learning setting. finally, we provide a deep-dive analysis into a benchmark, state-of-the-art network architecture to gain insight about how to design patterns of cnns on the task of visual sentiment prediction.",4 "a self-organizing neural network architecture for learning human-object interactions. the visual recognition of transitive actions comprising human-object interactions is a key component for artificial systems operating in natural environments.
this challenging task requires jointly the recognition of articulated body actions as well as the extraction of semantic elements from the scene, such as the identity of the manipulated objects. in this paper, we present a self-organizing neural network for the recognition of human-object interactions from rgb-d videos. our model consists of a hierarchy of grow-when-required (gwr) networks that learn prototypical representations of body motion patterns and objects, accounting for the development of action-object mappings in an unsupervised fashion. we report experimental results on a dataset of daily activities collected for the purpose of this study as well as on a publicly available benchmark dataset. in line with neurophysiological studies, the self-organizing architecture exhibits higher neural activation for congruent action-object pairs learned during training sessions with respect to synthetically created incongruent ones. we show that our unsupervised model shows competitive classification results on the benchmark dataset with respect to strictly supervised approaches.",4 "truncated variance reduction: a unified approach to bayesian optimization and level-set estimation. we present a new algorithm, truncated variance reduction (truvar), that treats bayesian optimization (bo) and level-set estimation (lse) with gaussian processes in a unified fashion. the algorithm greedily shrinks a sum of truncated variances within a set of potential maximizers (bo) or unclassified points (lse), which is updated based on confidence bounds. truvar is effective in several important settings that are typically non-trivial to incorporate in myopic algorithms, including pointwise costs and heteroscedastic noise. we provide a general theoretical guarantee for truvar covering these aspects, and use it to recover and strengthen existing results on bo and lse. moreover, we provide a new result for a setting where one can select from a number of noise levels having associated costs. we demonstrate the effectiveness of the algorithm on both synthetic and real-world data sets.",19 "grammar induction for mildly context sensitive languages using variational bayesian inference. the following technical report presents a formal approach to probabilistic minimalist grammar induction. we describe a formalization of a minimalist grammar. based on this grammar, we define a generative model for minimalist derivations.
we then present a generalized algorithm for the application of variational bayesian inference to lexicalized mildly context sensitive language grammars, which in this paper is applied to the previously defined minimalist grammar.",4 "convex relaxations of bregman divergence clustering. although many convex relaxations of clustering have been proposed in the past decade, current formulations remain restricted to spherical gaussian or discriminative models and are susceptible to imbalanced clusters. to address these shortcomings, we propose a new class of convex relaxations that can be flexibly applied to more general forms of bregman divergence clustering. by basing these new formulations on normalized equivalence relations we retain additional control over relaxation quality, which allows improvement in clustering quality. we furthermore develop optimization methods that improve scalability by exploiting recent implicit matrix norm methods. in practice, we find that the new formulations are able to efficiently produce tighter clusterings and improve upon the accuracy of state of the art methods.",4 "learning deep cnn denoiser prior for image restoration. model-based optimization methods and discriminative learning methods have been the two dominant strategies for solving various inverse problems in low-level vision. typically, these two kinds of methods have their respective merits and drawbacks; e.g., model-based optimization methods are flexible for handling different inverse problems but are usually time-consuming with sophisticated priors for the purpose of good performance, while discriminative learning methods have fast testing speed but their application range is greatly restricted by the specialized task. recent works have revealed that, with the aid of variable splitting techniques, a denoiser prior can be plugged in as a modular part of model-based optimization methods to solve other inverse problems (e.g., deblurring). such an integration induces a considerable advantage when the denoiser is obtained via discriminative learning. however, the study of integrating a fast discriminative denoiser prior is still lacking. to this end, this paper aims to train a set of fast and effective cnn (convolutional neural network) denoisers and integrate them into a model-based optimization method to solve other inverse problems.
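a toy sketch of the denoiser-as-prior idea described above: alternate a gradient step on the data-fit term with a denoising step. a box-filter denoiser stands in for the learned cnn, and a plain gradient step replaces the paper's variable-splitting scheme, so this is only the flavor of the integration, not the method itself:

```python
import numpy as np

def box_denoise(x):
    """Toy 3-tap smoothing denoiser standing in for a learned CNN denoiser."""
    padded = np.pad(x, 1, mode='edge')
    return (padded[:-2] + padded[1:-1] + padded[2:]) / 3.0

def plug_and_play(y, A, steps=200, lr=0.1):
    """Alternate a data-fit gradient step with a denoiser (prior) step."""
    x = np.zeros(A.shape[1])
    for _ in range(steps):
        x = x - lr * A.T @ (A @ x - y)   # gradient step on ||Ax - y||^2 / 2
        x = box_denoise(x)               # prior step via the denoiser
    return x

A = np.eye(16)                           # identity forward operator (pure denoising)
truth = np.ones(16)
y = truth + 0.02 * np.random.default_rng(1).standard_normal(16)
x_hat = plug_and_play(y, A)
```

with a non-identity `A` (e.g., a blur matrix) the same loop sketches deblurring with a denoiser prior.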
experimental results demonstrate that the learned set of denoisers not only achieves promising gaussian denoising results but can also be used as a prior to deliver good performance in various low-level vision applications.",4 "stochastic expectation propagation for large scale gaussian process classification. a method for large scale gaussian process classification has been recently proposed based on expectation propagation (ep). such a method allows gaussian process classifiers to be trained on very large datasets that were out of the reach of previous deployments of ep, and has been shown to be competitive with related techniques based on stochastic variational inference. nevertheless, the memory resources required scale linearly with the dataset size, unlike in variational methods. this can be a severe limitation when the number of instances is very large. here we show that this problem is avoided when stochastic ep is used to train the model.",19 "causal transfer in machine learning. methods of transfer learning try to combine knowledge from several related tasks (or domains) to improve performance on a test task. inspired by causal methodology, we relax the usual covariate shift assumption and assume that it holds true for a subset of predictor variables: the conditional distribution of the target variable given this subset of predictors is invariant over all tasks. we show how this assumption can be motivated by ideas from the field of causality. we prove that in an adversarial setting using this subset for prediction is optimal if no examples from the test task are observed; we further provide examples in which the tasks are sufficiently diverse and the estimator therefore outperforms pooling the data, even on average. if examples from the test task are available, we provide a method to transfer knowledge from the training tasks and exploit all available features for prediction. we introduce a practical method which allows for automatic inference of the above subset and provide the corresponding code. we present results on synthetic data sets and a gene deletion data set.",19 "a high-level model of neocortical feedback based on the event window segmentation algorithm. the author previously presented the event window segmentation (ews) algorithm [5], which uses purely statistical methods to learn to recognize recurring patterns in an input stream of events. in the following discussion, the ews algorithm is first extended to make predictions of future events.
next, the extended algorithm is used to construct a high-level, simplified model of the neocortical hierarchy. an event stream enters the bottom of the hierarchy and drives processing activity upward through the hierarchy. successively higher regions of the hierarchy learn to recognize successively deeper levels of patterns in the events that propagate up from the bottom of the hierarchy. the lower levels of the hierarchy use predictions from higher levels to strengthen their own predictions. a c++ source code listing of the model implementation and a test program is included in the appendix.",4 "detection and classification of masses in mammographic images with a multi-kernel approach. according to the world health organization, breast cancer is the main cause of cancer death among adult women in the world. although breast cancer occurs indiscriminately in countries at several degrees of social and economic development, mortality rates are still high among developing and underdeveloped countries, due to the low availability of early detection technologies. from the clinical point of view, mammography is still the most effective diagnostic technology, given the wide diffusion of the use and interpretation of these images. in this work we propose a method to detect and classify mammographic lesions using regions of interest of the images. our proposal consists in decomposing each image using multi-resolution wavelets. zernike moments are extracted from each wavelet component. with this approach we combine texture and shape features, which are applied to the detection and classification of mammary lesions. we used 355 images of fatty breast tissue from the irma database, with 233 normal instances (no lesion), 72 benign, and 83 malignant cases. classification was performed using svm and elm networks with modified kernels, in order to optimize the accuracy rates, reaching 94.11%. considering both accuracy rates and training times, we defined the ratio of the average percentage of accuracy to the average training time, in reverse order. our proposal's ratio was 50 times higher than the ratio obtained with the best method of the state-of-the-art. as our proposed model can combine a high accuracy rate with a low learning time, whenever new data are received our work will be able to save a lot of time, even hours, in the learning process in relation to the best method of the state-of-the-art.",4 "information in spike timing: neural codes derived from polychronous groups.
while there is growing evidence regarding the importance of spike timing in neural information processing, with even a small number of spikes carrying information, computational models lag significantly behind those for rate coding. experimental evidence on neuronal behavior is consistent with dynamical and state dependent behavior provided by recurrent connections. this motivates the minimalistic abstraction investigated in this paper, aimed at providing insight into information encoding in spike timing via recurrent connections. we employ information-theoretic techniques for a simple reservoir model which encodes input spatiotemporal patterns into a sparse neural code, translating the polychronous groups introduced by izhikevich into codewords on which we can perform standard vector operations. we show that the distance properties of the code are similar to those of (optimal) random codes. in particular, the code meets benchmarks associated with both linear classification and capacity, with the latter scaling exponentially with reservoir size.",16 "a mixed strategy may outperform a pure strategy: an initial study. in pure strategy meta-heuristics, only one search strategy is applied at any time. in mixed strategy meta-heuristics, each time one search strategy is chosen from a strategy pool with a probability and then applied. an example is classical genetic algorithms, where either a mutation or a crossover operator is chosen with a probability each time. the aim of this paper is to compare the performance of mixed strategy and pure strategy meta-heuristic algorithms. first, an experimental study is implemented, and the results demonstrate that mixed strategy evolutionary algorithms may outperform pure strategy evolutionary algorithms on the 0-1 knapsack problem in 77.8% of instances. then a complementary strategy theorem is rigorously proven for applying mixed strategy at the population level.
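a minimal sketch of a mixed-strategy evolutionary algorithm on the 0-1 knapsack problem as just described: each generation, one of two search strategies (mutation or crossover) is drawn from the pool with a probability. the population size, selection, and replacement rules here are illustrative choices, not the paper's experimental setup:

```python
import random

def fitness(x, weights, values, cap):
    """0-1 knapsack fitness: total value, or 0 when over capacity."""
    w = sum(wi for wi, xi in zip(weights, x) if xi)
    return sum(vi for vi, xi in zip(values, x) if xi) if w <= cap else 0

def mixed_strategy_ea(weights, values, cap, pop_size=20, gens=200, p_mut=0.5, seed=0):
    rng = random.Random(seed)
    n = len(weights)
    fit = lambda x: fitness(x, weights, values, cap)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(gens):
        if rng.random() < p_mut:              # strategy 1: bit-flip mutation
            child = max(pop, key=fit)[:]
            child[rng.randrange(n)] ^= 1
        else:                                 # strategy 2: one-point crossover
            a, b = rng.sample(pop, 2)
            cut = rng.randrange(1, n)
            child = a[:cut] + b[cut:]
        worst = min(range(pop_size), key=lambda i: fit(pop[i]))
        if fit(child) >= fit(pop[worst]):     # replace the worst individual
            pop[worst] = child
    best = max(pop, key=fit)
    return best, fit(best)

best, val = mixed_strategy_ea([2, 3, 4, 5], [3, 4, 5, 6], cap=5)
```

setting `p_mut` to 0 or 1 recovers the two pure strategies, which is the comparison the abstract describes.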
the theorem asserts that, given two meta-heuristic algorithms of which one uses pure strategy 1 and another uses pure strategy 2, the condition that pure strategy 2 is complementary to pure strategy 1 is sufficient and necessary for the existence of a mixed strategy meta-heuristic, derived from these two pure strategies, whose expected number of generations to find an optimal solution when starting from pure strategy 1's initial population is less than that of using pure strategy 1 alone on the same initial population.",4 "cafe wall illusion: local and global perception from multiple scales to multiscale. geometrical illusions are a subclass of optical illusions in which the geometrical characteristics of patterns, such as orientations and angles, are distorted and misperceived as a result of low- to high-level retinal/cortical processing. modelling the detection of tilt in these illusions, and the strengths with which it is perceived, is a challenging task computationally and leads to the development of techniques that match human performance. in this study, we present a predictive and quantitative approach for modeling foveal and peripheral vision in the induced tilt of the caf\'e wall illusion, in which parallel mortar lines between shifted rows of black and white tiles appear to converge and diverge. a bioderived filtering model of the responses of retinal/cortical simple cells to the stimulus using a difference of gaussians is utilized with an analytic processing pipeline introduced in our previous studies to quantify the angle of tilt in the model. here we have considered the visual characteristics of foveal and peripheral vision in the perceived tilt of the pattern to predict different degrees of tilt in different areas of the fovea and periphery as the eye saccades to different parts of the image. the tilt analysis results from several sampling sizes and aspect ratios, modelling variant foveal views, are used from our previous investigations on local tilt, and in this work we specifically investigate different configurations of the whole pattern, modelling variant gestalt views, across multiple scales in order to provide confidence intervals around the predicted tilts. the foveal sample sets are verified and quantified using two different sampling methods. we present a precise and quantified comparison contrasting local tilt detection in the foveal sets with a global average across all the caf\'e wall configurations tested in this work.",4 "functional correspondence by matrix completion.
in this paper, we consider the problem of finding dense intrinsic correspondence between manifolds using the recently introduced functional framework. we pose the functional correspondence problem as matrix completion with manifold geometric structure, inducing functional localization with the $l_1$ norm. we discuss efficient numerical procedures for the solution of our problem. our method compares favorably in accuracy to state-of-the-art correspondence algorithms on non-rigid shape matching benchmarks, and is especially advantageous in settings where only scarce data is available.",4 "prediction and control with temporal segment models. we introduce a method for learning the dynamics of complex nonlinear systems based on deep generative models over temporal segments of states and actions. unlike dynamics models that operate over individual discrete timesteps, we learn the distribution over future state trajectories conditioned on past state, past action, and planned future action trajectories, as well as a latent prior over action trajectories. our approach is based on convolutional autoregressive models and variational autoencoders. it makes stable and accurate predictions over long horizons for complex, stochastic systems, effectively expressing uncertainty and modeling the effects of collisions, sensory noise, and action delays. the learned dynamics model and action prior can be used for end-to-end, fully differentiable trajectory optimization and model-based policy optimization, which we use to evaluate the performance and sample-efficiency of our method.",4 "the inaturalist challenge 2017 dataset. existing image classification datasets used in computer vision tend to have an even number of images for each object category. in contrast, the natural world is heavily imbalanced, as some species are more abundant and easier to photograph than others. to encourage further progress in challenging real world conditions we present the inaturalist challenge 2017 dataset - an image classification benchmark consisting of 675,000 images with over 5,000 different species of plants and animals. it features many visually similar species, captured in a wide variety of situations, from all over the world. images were collected with different camera types, have varying image quality, have been verified by multiple citizen scientists, and feature a large class imbalance.
we discuss the collection of the dataset and present baseline results for state-of-the-art computer vision classification models. results show that current non-ensemble based methods achieve only 64% top one classification accuracy, illustrating the difficulty of the dataset. finally, we report results from a competition that was held with the data.",4 "triplet-based deep similarity learning for person re-identification. in recent years, person re-identification (re-id) has caught great attention in both the computer vision community and industry. in this paper, we propose a new framework for person re-identification with triplet-based deep similarity learning using convolutional neural networks (cnns). the network is trained with triplet input: two of the inputs have the same class label and one is different. it aims to learn a deep feature representation with which the distance within the same class is decreased, while the distance between different classes is increased as much as possible. moreover, we trained the model jointly on six different datasets, which differs from the common practice in which one model is trained on one dataset and tested on that same one. however, the enormous number of possible triplets among the large number of training samples makes the training infeasible. to address this challenge, a double-sampling scheme is proposed to generate triplets of images that are as effective as possible. the proposed framework is evaluated on several benchmark datasets. the experimental results show that our method is effective for the task of person re-identification and is comparable to, or even outperforms, the state-of-the-art methods.",4 "a method for aspect-based sentiment annotation using rhetorical analysis. this paper fills a gap in aspect-based sentiment analysis and aims to present a new method for preparing and analysing texts concerning opinion and generating user-friendly descriptive reports in natural language. we present a comprehensive set of techniques derived from rhetorical structure theory and sentiment analysis to extract aspects from textual opinions and then build an abstractive summary of a set of opinions. moreover, we propose aspect-aspect graphs to evaluate the importance of aspects and to filter out unimportant ones from the summary. additionally, the paper presents a prototype solution of data flow with interesting and valuable results.
the proposed method's results proved the high accuracy of aspect detection when applied to the gold standard dataset.",4 "personabank: a corpus of personal narratives with story intention graphs. we present a new corpus, personabank, consisting of 108 personal stories from weblogs that have been annotated with their story intention graphs, a deep representation of the fabula of a story. we describe the topics of the stories and the basis of the story intention graph representation, as well as the process of annotating the stories to produce the story intention graphs and the challenges of adapting the tool to the new personal narrative domain. we also discuss how the corpus can be used in applications that retell the story using different styles of tellings, co-tellings, or as a content planner.",4 "you only look once: unified, real-time object detection. we present yolo, a new approach to object detection. prior work on object detection repurposes classifiers to perform detection. instead, we frame object detection as a regression problem to spatially separated bounding boxes and associated class probabilities. a single neural network predicts bounding boxes and class probabilities directly from full images in one evaluation. since the whole detection pipeline is a single network, it can be optimized end-to-end directly on detection performance. our unified architecture is extremely fast. our base yolo model processes images in real-time at 45 frames per second. a smaller version of the network, fast yolo, processes an astounding 155 frames per second while still achieving double the map of other real-time detectors. compared to state-of-the-art detection systems, yolo makes more localization errors but is far less likely to predict false detections where nothing exists. finally, yolo learns very general representations of objects. it outperforms other detection methods, including dpm and r-cnn, by a wide margin when generalizing from natural images to artwork on both the picasso dataset and the people-art dataset.",4 "state estimation under non-gaussian levy noise: a modified kalman filtering method. the kalman filter is extensively used for state estimation for linear systems under gaussian noise. when non-gaussian l\'evy noise is present, the conventional kalman filter may fail to be effective due to the fact that the non-gaussian l\'evy noise may have infinite variance.
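for reference, the conventional (gaussian-noise) scalar kalman filter that the abstract above sets out to modify can be sketched as follows; the levy-noise modification itself is not reproduced here, and the random-walk state model and noise levels are illustrative assumptions:

```python
import numpy as np

def kalman_1d(zs, q=1e-4, r=0.04, x0=0.0, p0=1.0):
    """Textbook scalar Kalman filter for a random-walk state observed in
    gaussian noise: q = process noise variance, r = measurement variance."""
    x, p, out = x0, p0, []
    for z in zs:
        p = p + q                 # predict: state variance grows by q
        k = p / (p + r)           # kalman gain
        x = x + k * (z - x)       # update with measurement z
        p = (1 - k) * p           # posterior variance
        out.append(x)
    return np.array(out)

rng = np.random.default_rng(2)
measurements = 1.0 + 0.2 * rng.standard_normal(300)
est = kalman_1d(measurements)
```

under heavy-tailed levy noise the variance `r` used in the gain is no longer finite, which is exactly the failure mode the abstract's modification addresses.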
a modified kalman filter for linear systems with non-gaussian l\'evy noise is devised. it works effectively with reasonable computational cost. simulation results are presented to illustrate this non-gaussian filtering method.",12 "bio-inspired computation: success and challenges of ijbic. five years have passed since the launch of the international journal of bio-inspired computation (ijbic). in this time, significant new progress has been made in the area of bio-inspired computation. this review paper summarizes the success and achievements of ijbic in the past five years, and also highlights the challenges and key issues for further research.",12 "optimal cluster recovery in the labeled stochastic block model. we consider the problem of community detection or clustering in the labeled stochastic block model (lsbm) with a finite number $k$ of clusters of sizes linearly growing with the global population of items $n$. every pair of items is labeled independently at random, and a label $\ell$ appears with probability $p(i,j,\ell)$ between two items in clusters indexed by $i$ and $j$, respectively. the objective is to reconstruct the clusters from the observation of these random labels. clustering under the sbm and its extensions has attracted much attention recently. most existing work has aimed at characterizing the set of parameters for which it is possible to infer clusters that are either positively correlated with the true clusters, have a vanishing proportion of misclassified items, or exactly match the true clusters. we find the set of parameters for which there exists a clustering algorithm with at most $s$ misclassified items on average, under the general lsbm and for any $s=o(n)$, which solves one open problem raised in \cite{abbe2015community}. we further develop an algorithm, based on simple spectral methods, that achieves this fundamental performance limit within $o(n \mbox{polylog}(n))$ computations and without a-priori knowledge of the model parameters.",12 "a brownian motion model and the extreme belief machine for modeling sensor data measurements. as the title suggests, we describe (and justify through the presentation of the relevant mathematics) prediction methodologies for sensor measurements. this exposition is mainly concerned with the mathematics related to modeling sensor measurements.",4 "on the convergence of emphatic temporal-difference learning. we consider emphatic temporal-difference learning algorithms for policy evaluation in discounted markov decision processes with finite spaces.
these algorithms were recently proposed by sutton, mahmood, and white (2015) as an improved solution to the problem of divergence of off-policy temporal-difference learning with linear function approximation. we present in this paper the first convergence proofs for two emphatic algorithms, etd($\lambda$) and elstd($\lambda$). we prove, under general off-policy conditions, the convergence in $l^1$ of the elstd($\lambda$) iterates, and the almost sure convergence of the approximate value functions calculated by both algorithms using a single infinitely long trajectory. our analysis involves new techniques with applications beyond emphatic algorithms, leading, for example, to the first proof that standard td($\lambda$) also converges under off-policy training for $\lambda$ sufficiently large.",4 "seeing what is not there: learning context to determine where objects are missing. most of computer vision focuses on what is in an image. we propose to train a standalone object-centric context representation to perform the opposite task: seeing what is not there. given an image, our context model can predict where objects should exist, even when no object instances are present. combined with object detection results, we can perform a novel vision task: finding where objects are missing in an image. our model is based on a convolutional neural network structure. with a specially designed training strategy, the model learns to ignore objects and focus on context only. it is fully convolutional and thus highly efficient. experiments show the effectiveness of the proposed approach in one important accessibility task: finding city street regions where curb ramps are missing, which could help millions of people with mobility disabilities.",4 "post-proceedings of the first international workshop on learning and nonmonotonic reasoning. knowledge representation and reasoning and machine learning are two important fields in ai. nonmonotonic logic programming (nmlp) and answer set programming (asp) provide formal languages for representing and reasoning with commonsense knowledge and realize declarative problem solving in ai. on the other side, inductive logic programming (ilp) realizes machine learning in logic programming, provides a formal background to inductive learning, and its techniques have been applied to the fields of relational learning and data mining. generally speaking, nmlp and asp realize nonmonotonic reasoning but lack the ability of learning.
by contrast, ilp realizes inductive learning, but its techniques have been developed under classical monotonic logic. with this background, researchers have attempted to combine the two techniques in the context of nonmonotonic ilp. such a combination will introduce a learning mechanism to programs and would exploit new applications on the nmlp side, while on the ilp side it will extend the representation language and enable us to use existing solvers. cross-fertilization between learning and nonmonotonic reasoning can also occur in the use of answer set solvers for ilp, the speed-up of learning while running answer set solvers, learning action theories, learning transition rules in dynamical systems, abductive learning, learning biological networks with inhibition, and applications involving default negation. this workshop is the first attempt to provide an open forum for the identification of problems and the discussion of possible collaborations among researchers with complementary expertise. the workshop was held on september 15th, 2013 in corunna, spain. this post-proceedings contains five technical papers (out of six accepted papers) and the abstract of the invited talk by luc de raedt.",4 "risk estimation for matrix recovery with spectral regularization. in this paper, we develop an approach to recursively estimate the quadratic risk for matrix recovery problems regularized with spectral functions. toward this end, in the spirit of the sure theory, a key step is to compute the (weak) derivative and divergence of a solution with respect to the observations. as such a solution is not available in closed form, but rather through a proximal splitting algorithm, we propose to recursively compute the divergence from the sequence of iterates. a second challenge that we unlocked is the computation of the (weak) derivative of the proximity operator of a spectral function. to show the potential applicability of our approach, we exemplify it on a matrix completion problem to objectively and automatically select the regularization parameter.",12 "continuous action recognition based on sequence alignment. continuous action recognition is more challenging than isolated recognition because classification and segmentation must be simultaneously carried out.
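a minimal sketch of classical dynamic time warping, the sequence-alignment primitive on which the approach titled above builds (the per-frame video representation and the dfw extensions are not reproduced here):

```python
import numpy as np

def dtw_distance(a, b):
    """Classical dynamic time warping distance between two 1-d sequences,
    computed by dynamic programming over an (n+1) x (m+1) cost table."""
    n, m = len(a), len(b)
    d = np.full((n + 1, m + 1), np.inf)
    d[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # each cell extends the cheapest of match, insertion, deletion
            d[i, j] = cost + min(d[i - 1, j], d[i, j - 1], d[i - 1, j - 1])
    return float(d[n, m])

# identical sequences align at zero cost; a time-stretched copy also aligns at zero cost
dist0 = dtw_distance([1, 2, 3], [1, 2, 3])
dist1 = dtw_distance([1, 2, 3], [1, 1, 2, 2, 3, 3])
```

the time-stretched example is the property that makes dtw attractive for aligning actions performed at different speeds.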
We build on the well known dynamic time warping (DTW) framework and devise a novel visual alignment technique, namely dynamic frame warping (DFW), which performs isolated recognition based on a per-frame representation of videos, and on aligning a test sequence with a model sequence. Moreover, we propose two extensions which enable us to perform recognition concomitant with segmentation, namely one-pass DFW and two-pass DFW. These two methods have their roots in the domain of continuous recognition of speech and, to the best of our knowledge, their extension to continuous visual action recognition has been overlooked. We test and illustrate the proposed techniques with a recently released dataset (RAVEL) and with two public-domain datasets widely used in action recognition (Hollywood-1 and Hollywood-2). We also compare the performances of the proposed isolated and continuous recognition algorithms with several recently published methods.",4 "optimization of the Jaccard index for image segmentation with the Lovász hinge. The Jaccard loss, commonly referred to as the intersection-over-union loss, is commonly employed in the evaluation of segmentation quality due to its better perceptual quality and scale invariance, which lends appropriate relevance to small objects compared with per-pixel losses. We present a method for the direct optimization of the per-image intersection-over-union loss in neural networks, in the context of semantic image segmentation, based on a convex surrogate: the Lovász hinge. The loss is shown to perform better with respect to the Jaccard index measure than other losses traditionally used in the context of semantic segmentation, such as cross-entropy. We develop a specialized optimization method, based on an efficient computation of the proximal operator of the Lovász hinge, yielding reliably faster and more stable optimization than alternatives. We demonstrate the effectiveness of the method by showing substantially improved intersection-over-union segmentation scores on the PASCAL VOC dataset using a state-of-the-art deep learning segmentation architecture.",4 "positive-unlabeled convolutional neural networks for particle picking in cryo-electron micrographs. Cryo-electron microscopy (cryoEM) is fast becoming the preferred method for protein structure determination, and particle picking is a significant bottleneck in solving protein structures with single particle cryoEM.
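The dynamic time warping framework on which the dynamic frame warping entry above builds can be sketched as a standard dynamic program; this is generic textbook DTW over scalar sequences, not the per-frame visual variant of that paper:

```python
def dtw(a, b, dist=lambda x, y: abs(x - y)):
    """Classic dynamic time warping cost between two sequences."""
    n, m = len(a), len(b)
    INF = float("inf")
    # cost[i][j] = minimal alignment cost of a[:i] and b[:j]
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(a[i - 1], b[j - 1])
            # extend the warping path by a match, an insertion, or a deletion
            cost[i][j] = d + min(cost[i - 1][j - 1],
                                 cost[i - 1][j],
                                 cost[i][j - 1])
    return cost[n][m]
```

For example, `dtw([1, 2, 3], [1, 2, 2, 3])` returns `0.0`, since the repeated frame is absorbed by the warping path.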
Hand labeling sufficient numbers of particles can take months of effort, and current computationally based approaches are often ineffective. Here, we frame particle picking as a positive-unlabeled classification problem in which we seek to learn a convolutional neural network (CNN) to classify micrograph regions as particle or background from a small number of labeled positive examples and many unlabeled examples. However, fitting a model to so few labeled data points is a challenging machine learning problem. To address this, we develop a novel objective function, GE-binomial, for learning the model parameters in this context. This objective uses newly-formulated generalized expectation criteria to learn effectively from unlabeled data when using minibatched stochastic gradient descent optimizers. On a high-quality publicly available cryoEM dataset and a difficult unpublished dataset supplied by the Shapiro lab, we show that CNNs trained with this objective classify particles accurately with few positive training examples and outperform EMAN2's byRef method by a large margin, even with far fewer labeled training examples. Furthermore, we show that incorporating an autoencoder improves generalization when few labeled data points are available. We also compare the GE-binomial method with other positive-unlabeled learning methods never before applied to particle picking. We expect our particle picking tool, Topaz, based on CNNs trained with GE-binomial, to become an essential component of single particle cryoEM analysis, and the GE-binomial objective function to be widely applicable to other positive-unlabeled classification problems.",16 "emotional metaheuristics for in-situ foraging using sensor constrained robot swarms. We present a new social animal inspired emotional swarm intelligence technique. The technique is used to solve a variant of the popular collective robots problem called foraging. We show in a simulation study that simple interaction rules based on sensations like hunger and loneliness can lead to globally coherent emergent behavior that allows sensor constrained robots to solve the given problem.",4 "schoenberg transformations in data analysis: theory and illustrations. 
The class of Schoenberg transformations, embedding Euclidean distances into higher dimensional Euclidean spaces, is presented, and derived from theorems on positive definite and conditionally negative definite matrices. Original results on the arc lengths, angles and curvature of the transformations are proposed, and visualized on artificial data sets by classical multidimensional scaling. A simple distance-based discriminant algorithm illustrates the theory, which is intimately connected to the Gaussian kernels of machine learning.",19 "modeling item-difficulty for ontology-based MCQs. Multiple choice questions (MCQs) generated from a domain ontology can significantly reduce the human effort and time required for authoring and administering assessments in an e-learning environment. Even though there are various methods for generating MCQs from ontologies, methods for determining the difficulty-levels of such MCQs are less explored. In this paper, we study various aspects and factors involved in determining the difficulty-score of an MCQ, and propose an ontology-based model for its prediction. The model characterizes the difficulty values associated with the stem and choice set of the MCQs, and describes a measure which combines both scores. What is more, the notion of assigning difficulty-scores based on the skill level of the test taker is utilized for predicting the difficulty-score of a stem. We studied the effectiveness of the predicted difficulty-scores with the help of a psychometric model from item response theory, involving real students and domain experts. Our results show that the predicted difficulty-levels of the MCQs have a high correlation with their actual difficulty-levels.",4 "ten years of pedestrian detection, what have we learned?. Paper-by-paper results make it easy to miss the forest for the trees. We analyse the remarkable progress of the last decade by discussing the main ideas explored in the 40+ detectors currently present in the Caltech pedestrian detection benchmark. We observe that there exist three families of approaches, all currently reaching similar detection quality. Based on our analysis, we study the complementarity of the most promising ideas by combining multiple published strategies. Our new decision forest detector achieves the current best known performance on the challenging Caltech-USA dataset.",4 "discourse obligations in dialogue processing. 
We show that in modeling social interaction, particularly dialogue, the attitude of obligation can be a useful adjunct to the popularly considered attitudes of belief, goal, and intention and their mutual and shared counterparts. In particular, we show how discourse obligations can be used to account in a natural manner for the connection between a question and its answer in dialogue, and how obligations can be used along with other parts of the discourse context to extend the coverage of a dialogue system.",2 "preprint: ARPPS, augmented reality pipeline prospect system. This is the preprint version of our paper at ICONIP. Outdoor augmented reality geographic information systems (ARGIS) have been a hot application of augmented reality in recent years. This paper concludes the key solutions of ARGIS, designs a mobile augmented reality pipeline prospect system (ARPPS), and respectively realizes a machine vision based pipeline prospect system (MVBPPS) and a sensor based pipeline prospect system (SBPPS). For the MVBPPS's realization, this paper studies a neural network based 3D features matching method.",4 "logic-based clustering and learning for time-series data. To effectively analyze and design cyberphysical systems (CPS), designers today have to combat the data deluge problem, i.e., the burden of processing intractably large amounts of data produced by complex models and experiments. In this work, we utilize monotonic parametric signal temporal logic (PSTL) to design features for unsupervised classification of time series data. This enables using off-the-shelf machine learning tools to automatically cluster similar traces with respect to a given PSTL formula. We demonstrate how this technique produces interpretable formulas that are amenable to analysis and understanding using representative examples. We illustrate the approach with case studies related to automotive engine testing, highway traffic analysis, and auto-grading massively open online courses.",4 "signer-independent fingerspelling recognition with deep neural network adaptation. We study the problem of recognition of fingerspelled letter sequences in American Sign Language in a signer-independent setting. Fingerspelled sequences are both challenging and important to recognize, as they are used for many content words such as proper nouns and technical terms.
Previous work has shown that it is possible to achieve almost 90% accuracies on fingerspelling recognition in a signer-dependent setting. However, the more realistic signer-independent setting presents challenges due to significant variations among signers, coupled with the dearth of available training data. We investigate this problem with approaches inspired by automatic speech recognition. We start with the best-performing approaches from prior work, based on tandem models and segmental conditional random fields (SCRFs), with features based on deep neural network (DNN) classifiers of letters and phonological features. Using DNN adaptation, we find that it is possible to bridge a large part of the gap between signer-dependent and signer-independent performance. Using only 115 transcribed words for adaptation from the target signer, we obtain letter accuracies of 82.7% with frame-level adaptation labels and 69.7% with word labels.",4 "efficiently sampling multiplicative attribute graphs using a ball-dropping process. We introduce a novel and efficient sampling algorithm for the multiplicative attribute graph model (MAGM - Kim and Leskovec (2010)). Our algorithm is \emph{strictly} more efficient than the algorithm proposed by Yun and Vishwanathan (2012), in the sense that our method extends the \emph{best} time complexity guarantee of their algorithm to a larger fraction of the parameter space. Both in theory and in empirical evaluation on sparse graphs, our new algorithm outperforms the previous one. To design our algorithm, we first define a stochastic \emph{ball-dropping process} (BDP). Although a special case of this process was introduced as an efficient approximate sampling algorithm for the Kronecker product graph model (KPGM - Leskovec et al. (2010)), neither \emph{why} the approximation works nor \emph{what} the actual distribution this process is sampling from has been addressed so far, to the best of our knowledge. Our rigorous treatment of the BDP enables us to clarify the rationale behind the BDP approximation of KPGM, and to design an efficient sampling algorithm for the MAGM.",19 "predicting the energy output of wind farms based on weather data: important variables and their correlation. Wind energy plays an increasing role in the supply of energy world-wide. The energy output of a wind farm is highly dependent on the weather conditions present at the wind farm.
If the output can be predicted more accurately, energy suppliers can coordinate the collaborative production of different energy sources more efficiently and avoid costly overproduction. In this paper, we take a computer science perspective on energy prediction based on weather data and analyze the important parameters as well as their correlation with the energy output. To deal with the interaction of the different parameters, we use symbolic regression based on the genetic programming tool DataModeler. Our studies are carried out on publicly available weather and energy data for a wind farm in Australia. We reveal the correlation of the different variables with the energy output. The model obtained for energy prediction gives a reliable prediction of the energy output for newly given weather data.",4 "cutset sampling for Bayesian networks. The paper presents a new sampling methodology for Bayesian networks that samples only a subset of the variables and applies exact inference to the rest. Cutset sampling is a network structure-exploiting application of the Rao-Blackwellisation principle to sampling in Bayesian networks. It improves convergence by exploiting memory-based inference algorithms. It can also be viewed as an anytime approximation of the exact cutset-conditioning algorithm developed by Pearl. Cutset sampling can be implemented efficiently when the sampled variables constitute a loop-cutset of the Bayesian network and, more generally, when the induced width of the network's graph conditioned on the observed and sampled variables is bounded by a constant w. We demonstrate empirically the benefit of this scheme on a range of benchmarks.",4 "a reinforcement learning approach to the parallelization of filters aggregation based feature selection algorithms. One of the classical problems in machine learning and data mining is feature selection. A feature selection algorithm is expected to be quick, and at the same time it should show high performance. The MeLiF algorithm effectively solves this problem using ensembles of ranking filters. This article describes two different ways to improve the MeLiF algorithm's performance through parallelization. Experiments show that the proposed schemes significantly improve algorithm performance and increase feature selection quality.",4 "coherent integration of databases by abductive logic programming. We introduce an abductive method for the coherent integration of independent data-sources.
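A toy numeric illustration of the Rao-Blackwellisation principle behind the cutset sampling entry above, assuming a two-variable chain invented for the example: the "cutset" variable A is sampled from its prior, the rest is handled by exact inference, and the exact conditionals are averaged:

```python
import random

# Toy chain A -> B. Cutset sampling in miniature: sample the cutset
# variable A, then treat B by exact inference (here a table lookup),
# and average the exact conditionals instead of raw samples of B.
P_A = {0: 0.6, 1: 0.4}          # prior on A (invented numbers)
P_B_GIVEN_A = {0: 0.2, 1: 0.9}  # P(B=1 | A)

def exact_p_b():
    """Exact P(B=1) by summing out A."""
    return sum(P_A[a] * P_B_GIVEN_A[a] for a in P_A)

def cutset_estimate_p_b(n_samples, seed=0):
    """Rao-Blackwellised Monte Carlo estimate of P(B=1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        a = 0 if rng.random() < P_A[0] else 1  # sample the cutset
        total += P_B_GIVEN_A[a]                # exact inference on the rest
    return total / n_samples
```

Averaging the exact conditionals rather than sampled values of B is what lowers the variance of the estimate relative to plain forward sampling.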
The idea is to compute a list of data-facts that should be inserted into the amalgamated database or retracted from it in order to restore its consistency. The method is implemented by an abductive solver, called ASystem, that applies SLDNFA-resolution on a meta-theory that relates the different, possibly contradicting, input databases. We also give a pure model-theoretic analysis of the possible ways to `recover' consistent data from an inconsistent database, in terms of those models of the database that exhibit as minimal inconsistent information as reasonably possible. This allows us to characterize the `recovered databases' in terms of the `preferred' (i.e., most consistent) models of the theory. The outcome is an abductive-based application that is sound and complete with respect to the corresponding model-based, preferential semantics, and -- to the best of our knowledge -- is more expressive (thus more general) than any other implementation of coherent integration of databases.",4 "towards a quantification of the semantic information encoded in written language. Written language is a complex communication signal capable of conveying information encoded in the form of ordered sequences of words. Beyond the local order ruled by grammar, semantic and thematic structures affect long-range patterns in word usage. Here, we show that a direct application of information theory quantifies the relationship between the statistical distribution of words and the semantic content of the text. We show that there is a characteristic scale, roughly around a thousand words, which establishes the typical size of the most informative segments in written language. Moreover, we find that the words whose contributions to the overall information are larger are the ones most closely associated with the main subjects and topics of the text. This scenario can be explained by a model of word usage that assumes that words are distributed along the text in domains of a characteristic size where their frequency is higher than elsewhere. Our conclusions are based on the analysis of a large database of written language, diverse in subjects and styles, and are thus likely applicable to general language sequences encoding complex information.",15 "sharesnet: reducing residual network parameter number by sharing weights. Deep residual networks have recently reached the state of the art in many image processing tasks such as image classification. However, the cost of this gain in accuracy, in terms of depth and memory, is prohibitive, as it requires a higher number of residual blocks, up to double the initial value.
To tackle this problem, we propose in this paper a way to reduce the redundant information in such networks. We share the weights of the convolutional layers between residual blocks operating at the same spatial scale. The signal therefore flows multiple times through the same convolutional layer. The resulting architecture, called ShaResNet, contains block-specific layers and shared layers. ShaResNets are trained in exactly the same fashion as commonly used residual networks. We show, on the one hand, that they are almost as efficient as their sequential counterparts while involving fewer parameters, and on the other hand that they are more efficient than a residual network with the same number of parameters. For example, a 152-layer-deep residual network can be reduced to 106 convolutional layers, i.e. a parameter gain of 39%, while losing less than 0.2% accuracy on ImageNet.",4 "why is posterior sampling better than optimism for reinforcement learning?. Computational results demonstrate that posterior sampling for reinforcement learning (PSRL) dramatically outperforms algorithms driven by optimism, such as UCRL2. We provide insight into the extent of this performance boost and the phenomenon that drives it. We leverage this insight to establish an $\tilde{O}(H\sqrt{SAT})$ Bayesian expected regret bound for PSRL in finite-horizon episodic Markov decision processes, where $H$ is the horizon, $S$ is the number of states, $A$ is the number of actions and $T$ is the time elapsed. This improves upon the best previous bound of $\tilde{O}(HS\sqrt{AT})$ for any reinforcement learning algorithm.",19 "learning shared representations in multi-task reinforcement learning. We investigate a paradigm in multi-task reinforcement learning (MT-RL) in which an agent is placed in an environment and needs to learn to perform a series of tasks, within the same space. Since the environment does not change, there is potentially a lot of common ground amongst tasks, and learning to solve them individually seems extremely wasteful. In this paper, we explicitly model and learn this shared structure as it arises in the state-action value space. We show how one can jointly learn optimal value-functions by modifying the popular value-iteration and policy-iteration procedures to accommodate this shared representation assumption and leverage the power of multi-task supervised learning. Finally, we demonstrate that the proposed model and training procedures are able to infer good value functions, even under low sample regimes.
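The parameter saving from sharing convolutional weights across residual blocks of one spatial scale, as in the ShaResNet entry above, can be illustrated with a toy count; the layer sizes and sharing pattern below are made up purely for illustration:

```python
def conv_params(c_in, c_out, k):
    """Weights in a k x k convolution (biases ignored)."""
    return c_in * c_out * k * k

# A hypothetical "stage" of 4 residual blocks, each with two 3x3
# convolutions on 64 channels. We imagine sharing one of the two
# convolutions across all blocks of the stage.
blocks, convs_per_block = 4, 2
per_conv = conv_params(64, 64, 3)

sequential = blocks * convs_per_block * per_conv  # no sharing
shared = blocks * per_conv + per_conv             # one conv shared stage-wide

saving = 1 - shared / sequential                  # fraction of weights removed
```

With these invented numbers the stage shrinks from 294,912 to 184,320 weights, a 37.5% saving, which is the flavor of the 39% reduction reported in the entry.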
In addition to data efficiency, we show through our analysis that learning abstractions of the state space jointly across tasks leads to more robust, transferable representations with the potential for better generalization.",4 "cross-situational and supervised learning in the emergence of communication. Scenarios for the emergence or bootstrap of a lexicon involve the repeated interaction of at least two agents who must reach a consensus on how to name N objects using H words. Here we consider minimal models of two types of learning algorithms: cross-situational learning, in which the individuals determine the meaning of a word by looking for something in common across all observed uses of that word, and supervised operant conditioning learning, in which there is strong feedback between individuals about the intended meaning of the words. Despite the stark differences between the learning schemes, we show that they yield the same communication accuracy in the realistic limits of large N and H, which coincides with the result of the classical occupancy problem of randomly assigning N objects to H words.",4 "crowd flow prediction by deep spatio-temporal transfer learning. Crowd flow prediction is a fundamental urban computing problem. Recently, deep learning has been successfully applied to solve this problem, but it relies on rich historical data. In reality, many cities may suffer from a data scarcity issue when their targeted service or infrastructure is new. To overcome this issue, this paper proposes a novel deep spatio-temporal transfer learning framework, called RegionTrans, to predict future crowd flow in a data-scarce (target) city by transferring knowledge from a data-rich (source) city. Leveraging social network check-ins, RegionTrans first links each region in the target city to certain regions in the source city, expecting that these inter-city region pairs share similar crowd flow dynamics. Then, we propose a deep spatio-temporal neural network structure, in which a hidden layer is dedicated to keeping the region representation.
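The cross-situational scheme in the entry above determines a word's meaning as whatever remains in common across all observed uses of the word; a minimal sketch, with a three-word lexicon and exposure protocol invented for the example:

```python
def cross_situational_learn(episodes):
    """episodes: list of (word, context) pairs, context = set of objects.

    A word's candidate meanings are intersected across situations.
    """
    hypotheses = {}
    for word, context in episodes:
        if word not in hypotheses:
            hypotheses[word] = set(context)
        else:
            hypotheses[word] &= context  # keep only what is common
    return hypotheses

# Invented lexicon: each nonsense word names one object.
true_lexicon = {"dax": "ball", "wug": "dog", "fep": "cup"}
objects = list(true_lexicon.values())

# Each exposure shows the named object together with one distractor;
# cycling the distractor guarantees the intersection becomes a singleton.
episodes = []
for i in range(2):
    for word, obj in true_lexicon.items():
        distractors = [o for o in objects if o != obj]
        episodes.append((word, {obj, distractors[i]}))

learned = cross_situational_learn(episodes)
```

After two rounds every word's hypothesis set has collapsed to the correct single object, with no feedback about intended meanings ever being given.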
The source city model is then trained with its rich historical data under this network structure. Finally, we propose a region-based cross-city transfer learning algorithm to learn the target city model from the source city model by minimizing the hidden representation discrepancy between the inter-city region pairs previously linked by check-ins. Experiments on real crowd flow data show that RegionTrans can outperform the state-of-the-art, reducing prediction error by up to 10.7%.",4 "CBAS: context based Arabic stemmer. Arabic morphology encapsulates many valuable features such as the word root. Arabic roots are utilized in many tasks; the process of extracting a word's root is referred to as stemming. Stemming is an essential part of most natural language processing tasks, especially for derivative languages such as Arabic. However, stemming is faced with the problem of ambiguity, where two or more roots could be extracted from the same word. On the other hand, distributional semantics is a powerful co-occurrence model that captures the meaning of a word based on its context. In this paper, a distributional semantics model utilizing smoothed pointwise mutual information (SPMI) is constructed to investigate its effectiveness on the stemming analysis task. It showed an accuracy of 81.5%, at least a 9.4% improvement over other stemmers.",4 "meta-unsupervised-learning: a supervised approach to unsupervised learning. We introduce a new paradigm to investigate unsupervised learning, reducing unsupervised learning to supervised learning. Specifically, we mitigate the subjectivity in unsupervised decision-making by leveraging knowledge acquired from prior, possibly heterogeneous, supervised learning tasks. We demonstrate the versatility of our framework via comprehensive expositions and detailed experiments on several unsupervised problems such as (a) clustering, (b) outlier detection, and (c) similarity prediction under the common umbrella of meta-unsupervised-learning. We also provide rigorous PAC-agnostic bounds to establish the theoretical foundations of our framework, and show that our framing of meta-clustering circumvents Kleinberg's impossibility theorem for clustering.",4 "an efficient dual approach to distance metric learning. Distance metric learning is of fundamental interest in machine learning because the distance metric employed can significantly affect the performance of many learning methods.
Quadratic Mahalanobis metric learning is a popular approach to the problem, but it typically requires solving a semidefinite programming (SDP) problem, which is computationally expensive. Standard interior-point SDP solvers typically have a complexity of $O(D^{6.5})$ (with $D$ the dimension of the input data), and can thus only practically solve problems exhibiting less than a thousand variables. Since the number of variables is $D(D+1)/2$, this implies a limit upon the size of problem that can practically be solved of around a hundred dimensions. The complexity of the popular quadratic Mahalanobis metric learning approach thus limits the size of problem to which metric learning can be applied. We propose a significantly more efficient approach to the metric learning problem based on the Lagrange dual formulation of the problem. The proposed formulation is much simpler to implement, and therefore allows much larger Mahalanobis metric learning problems to be solved. The time complexity of the proposed method is $O(D^3)$, which is significantly lower than that of the SDP approach. Experiments on a variety of datasets demonstrate that the proposed method achieves an accuracy comparable to the state-of-the-art, but is applicable to significantly larger problems. We also show that the proposed method can be applied to solve more general Frobenius-norm regularized SDP problems approximately.",4 "interpnet: neural introspection for interpretable deep learning. Humans are able to explain their reasoning. On the contrary, deep neural networks are not. This paper attempts to bridge this gap by introducing a new way to design interpretable neural networks for classification, inspired by physiological evidence of the human visual system's inner-workings. This paper proposes a neural network design paradigm, termed InterpNET, which can be combined with any existing classification architecture to generate natural language explanations of the classifications. The success of the module relies on the assumption that the network's computation and reasoning are represented in its internal layer activations. While in principle InterpNET could be applied to any existing classification architecture, it is evaluated via an image classification and explanation task. Experiments on the CUB bird classification and explanation dataset show qualitatively and quantitatively that the model is able to generate high-quality explanations.
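The quadratic Mahalanobis distance at the heart of the metric learning entry above, and the $D(D+1)/2$ variable count that makes its SDP formulation expensive, can be sketched in a few lines; the matrix `M` below is hand-picked for illustration, not learned:

```python
def mahalanobis_sq(x, y, M):
    """Squared Mahalanobis distance (x - y)^T M (x - y) for a PSD M."""
    diff = [a - b for a, b in zip(x, y)]
    # quadratic form written out with plain lists
    return sum(diff[i] * M[i][j] * diff[j]
               for i in range(len(diff)) for j in range(len(diff)))

# A toy PSD metric that stretches the first axis and shrinks the second.
M = [[2.0, 0.0],
     [0.0, 0.5]]

def free_variables(d):
    """Free entries in a symmetric d x d metric: d(d+1)/2."""
    return d * (d + 1) // 2
```

Already at $D = 100$ there are 5,050 free variables, which is why interior-point SDP solvers run out of steam at a few hundred dimensions and a cheaper $O(D^3)$ dual route becomes attractive.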
While the current state-of-the-art METEOR score on this dataset is 29.2, InterpNET achieves a much higher METEOR score of 37.9.",19 "the hitchhiker's guide to search-based software engineering for software product lines. Search based software engineering (SBSE) is an emerging discipline that focuses on the application of search-based optimization techniques to software engineering problems. The capacity of SBSE techniques to tackle problems involving large search spaces makes their application attractive for software product lines (SPLs). In recent years, several publications have appeared that apply SBSE techniques to SPL problems. In this paper, we present the results of a systematic mapping study of such publications. We identified the stages of the SPL life cycle where SBSE techniques have been used, what case studies have been employed, and how they have been analysed. This mapping study revealed potential venues for further research, as well as common misunderstandings and pitfalls when applying SBSE techniques, which we address by providing a guideline for researchers and practitioners interested in exploiting these techniques.",4 "using belief theory to diagnose control knowledge quality: application to cartographic generalisation. Both humans and artificial systems frequently use trial and error methods in problem solving. In order to be effective, this type of strategy implies having high quality control knowledge to guide the quest for the optimal solution. Unfortunately, this control knowledge is rarely perfect. Moreover, in artificial systems -- as in humans -- self-evaluation of one's own knowledge is often difficult. Yet, this self-evaluation can be very useful to manage knowledge and to determine when to revise it. The objective of this work is to propose an automated approach to evaluate the quality of control knowledge in artificial systems based on a specific trial and error strategy, namely the informed tree search strategy. Our revision approach consists of analysing the system's execution logs, and using belief theory to evaluate the global quality of the knowledge. We present a real-world industrial application in the form of an experiment using this approach in the domain of cartographic generalisation. Thus far, the results of using our approach have been encouraging.",4 "refined lower bounds for adversarial bandits. We provide new lower bounds on the regret that must be suffered by adversarial bandit algorithms.
The new results show that recent upper bounds that either (a) hold with high-probability, (b) depend on the total loss of the best arm, or (c) depend on the quadratic variation of the losses, are close to tight. Besides this, we prove two impossibility results. First, the existence of a single arm that is optimal in every round cannot improve the regret in the worst case. Second, the regret cannot scale with the effective range of the losses. In contrast, both results are possible in the full-information setting.",12 "making neural programming architectures generalize via recursion. Empirically, neural networks that attempt to learn programs from data have exhibited poor generalizability. Moreover, it has traditionally been difficult to reason about the behavior of these models beyond a certain level of input complexity. In order to address these issues, we propose augmenting neural architectures with a key abstraction: recursion. As an application, we implement recursion in the neural programmer-interpreter framework on four tasks: grade-school addition, bubble sort, topological sort, and quicksort. We demonstrate superior generalizability and interpretability with small amounts of training data. Recursion divides the problem into smaller pieces and drastically reduces the domain of each neural network component, making it tractable to prove guarantees about the overall system's behavior. Our experience suggests that in order for neural architectures to robustly learn program semantics, it is necessary to incorporate a concept like recursion.",4 "bayesian inference for radar imagery based surveillance. We are interested in creating an automated or semi-automated system with the capability of taking a set of radar imagery, a collection of parameters, a priori maps and tactical data, and producing likely interpretations of the possible military situations given the available evidence. This paper is concerned with the problem of interpretation and the computation of certainty or belief in the conclusions reached by the system.",4 "still doing evolutionary algorithms in Perl. Algorithm::Evolutionary (A::E from now on) was introduced in 2002, in a talk at YAPC::EU in Munich. 7 years later, A::E is at version 0.67 (past the ""number of the beast"" 0.666), and is used extensively, to the point of being the foundation of much of the (computer) science done in our research group (and, admittedly, by many others).
There is still a lot to be done, however; A::E is being integrated with POE so that evolutionary algorithms (EAs) can be combined with all kinds of servers and used as a client, as servers, or anything in between. This companion talk will explain what evolutionary algorithms are, what they can be used for, how to do them in Perl (using the fine modules found in CPAN), and evolutionary algorithms in Perl in the large.",4 "how effective can simple ordinal peer grading be?. Ordinal peer grading has been proposed as a simple and scalable solution for computing reliable information about student performance in massive open online courses. The idea is to outsource the grading task to the students as follows. After the end of an exam, each student is asked to rank -- in terms of quality -- a bundle of exam papers by fellow students. An aggregation rule then combines the individual rankings into a global one that contains all students. We define a broad class of simple aggregation rules and present a theoretical framework for assessing their effectiveness. When statistical information about the grading behaviour of students is available, the framework can be used to compute the optimal rule from this class with respect to a series of performance objectives. For example, a natural rule known as Borda is proved to be optimal when students grade correctly. In addition, we present extensive simulations and a field experiment that validate our theory and prove it to be extremely accurate in predicting the performance of aggregation rules, even when only rough information about grading behaviour is available.",4 "possibilistic fuzzy local information c-means for sonar image segmentation. Side-look synthetic aperture sonar (SAS) can produce very high quality images of the sea-floor. When viewing this imagery, a human observer can often easily identify various sea-floor textures such as sand ripple, hard-packed sand, sea grass and rock. In this paper, we present the possibilistic fuzzy local information c-means (PFLICM) approach to segment SAS imagery into sea-floor regions that exhibit these various natural textures. The proposed PFLICM method incorporates fuzzy and possibilistic clustering methods and leverages (local) spatial information to perform soft segmentation. Results are shown on several SAS scenes and compared to alternative segmentation approaches.",4 "the semantics of probabilistic inference. A number of writers (Joseph Halpern and Fahiem Bacchus among them) have offered semantics for formal languages in which inferences concerning probabilities can be made.
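The Borda rule from the ordinal peer grading entry above can be sketched as a plain positional count over bundles; the papers and rankings below are invented for the example:

```python
from collections import defaultdict

def borda_aggregate(rankings):
    """Aggregate partial rankings (best paper first) into a global order.

    Each bundle of size k awards k points to its 1st paper, k-1 to its
    2nd, and so on; the global ranking sorts by total score.
    """
    score = defaultdict(int)
    for bundle in rankings:
        k = len(bundle)
        for pos, paper in enumerate(bundle):
            score[paper] += k - pos
    return sorted(score, key=lambda p: -score[p])

# Three graders, each ranking a small bundle of fellow students' papers.
rankings = [
    ["alice", "bob", "carol"],
    ["alice", "carol", "dave"],
    ["bob", "alice", "dave"],
]
global_order = borda_aggregate(rankings)
```

With these bundles the totals are alice 8, bob 5, carol 3, dave 2, so the global ranking places alice first even though no single grader saw all four papers.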
Our concern is different. This paper provides a formalization of nonmonotonic inferences in which the conclusion is supported only to a certain degree. Such inferences are clearly 'invalid', since they must allow the falsity of the conclusion even when the premises are true. Nevertheless, such inferences can be characterized both syntactically and semantically. The 'premises' of probabilistic arguments are sets of statements (as in a database or knowledge base), and the conclusions are categorical statements in the language. We provide standards both for this form of inference, in which a high probability is required, and for inference in which the conclusion is qualified by an intermediate interval of support.",4 "image type water meter character recognition based on embedded DSP. In this paper, combining a DSP processor with an image processing algorithm, we studied a method of water meter character recognition. We collected water meter images with a camera at a fixed angle, and used the projection method to recognize the digital images. The experiment results show that the method can recognize the meter characters accurately, so that artificial meter reading can be replaced by automatic digital recognition, which improves working efficiency.",4 "speeding up latent variable gaussian graphical model estimation via nonconvex optimizations. We study the estimation of the latent variable Gaussian graphical model (LVGGM), where the precision matrix is the superposition of a sparse matrix and a low-rank matrix. In order to speed up the estimation of the sparse plus low-rank components, we propose a sparsity constrained maximum likelihood estimator based on matrix factorization, and an efficient alternating gradient descent algorithm with hard thresholding to solve it. Our algorithm is orders of magnitude faster than the convex relaxation based methods for LVGGM. In addition, we prove that our algorithm is guaranteed to linearly converge to the unknown sparse and low-rank components up to the optimal statistical precision. Experiments on both synthetic and genomic data demonstrate the superiority of our algorithm over the state-of-the-art algorithms and corroborate our theory.",19 "long short-term memory-networks for machine reading. In this paper we address the question of how to render sequence-level networks better at handling structured input. We propose a machine reading simulator which processes text incrementally from left to right and performs shallow reasoning with memory and attention.
The reader extends the long short-term memory architecture with a memory network in place of a single memory cell. This enables adaptive memory usage during recurrence with neural attention, offering a way to weakly induce relations among tokens. The system is initially designed to process a single sequence, but we also demonstrate how to integrate it with an encoder-decoder architecture. Experiments on language modeling, sentiment analysis, and natural language inference show that our model matches or outperforms the state of the art.",4 "orientation covariant aggregation of local descriptors with embeddings. Image search systems based on local descriptors typically achieve orientation invariance by aligning the patches on their dominant orientations. Albeit successful, this choice introduces too much invariance because it does not guarantee that the patches are rotated consistently. This paper introduces an aggregation strategy of local descriptors that achieves this covariance property by jointly encoding the angle in the aggregation stage in a continuous manner. It is combined with an efficient monomial embedding to provide a codebook-free method to aggregate local descriptors into a single vector representation. Our strategy is also compatible with and employed with several popular encoding methods, in particular bag-of-words, VLAD and the Fisher vector. Our geometric-aware aggregation strategy is effective for image search, as shown by experiments performed on standard benchmarks for image and particular object retrieval, namely Holidays and Oxford buildings.",4 "variational bayesian inference for hidden markov models with multivariate gaussian output distributions. Hidden Markov models (HMM) have been used for several years in many time series analysis and pattern recognition tasks. HMM are often trained by means of the Baum-Welch algorithm, which can be seen as a special variant of the expectation maximization (EM) algorithm. Second-order training techniques such as variational Bayesian inference (VI) for probabilistic models regard the parameters of the probabilistic models as random variables and define distributions over these distribution parameters, hence the name of this technique. VI can also be regarded as a special case of the EM algorithm. In this article, we bring both together and train HMM with multivariate Gaussian output distributions with VI. The article defines the new training technique for HMM.
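The Baum-Welch/VI training in the HMM entry above builds on the forward recursion for the observation likelihood; a discrete-output sketch (the entry uses multivariate Gaussian outputs, and these toy parameters are invented):

```python
def hmm_forward(pi, A, B, obs):
    """Forward algorithm: P(obs) for a discrete-output HMM.

    pi: initial state distribution, A[i][j]: transition probabilities,
    B[i][o]: emission probabilities, obs: list of symbol indices.
    """
    n = len(pi)
    # alpha[i] = P(obs so far, current state = i)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [B[j][o] * sum(alpha[i] * A[i][j] for i in range(n))
                 for j in range(n)]
    return sum(alpha)

# Toy two-state model: state 0 mostly emits symbol 0, state 1 symbol 1.
pi = [0.5, 0.5]
A = [[0.9, 0.1],
     [0.2, 0.8]]
B = [[0.7, 0.3],
     [0.1, 0.9]]
likelihood = hmm_forward(pi, A, B, [0, 0, 1])
```

Both Baum-Welch and its variational Bayesian variant repeatedly evaluate quantities of exactly this form in their E-step; with Gaussian outputs, the `B` lookup would be replaced by a density evaluation.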
The evaluation is based on case studies and a comparison with related approaches as part of ongoing work.",4 "searnn: training rnns with global-local losses. We propose SEARNN, a novel training algorithm for recurrent neural networks (RNNs) inspired by the ""learning to search"" (L2S) approach to structured prediction. RNNs have been widely successful in structured prediction applications such as machine translation or parsing, and are commonly trained using maximum likelihood estimation (MLE). Unfortunately, this training loss is not always an appropriate surrogate for the test error: by only maximizing the ground truth probability, it fails to exploit the wealth of information offered by structured losses. Further, it introduces discrepancies between training and predicting (such as exposure bias) that may hurt test performance. Instead, SEARNN leverages test-alike search space exploration to introduce global-local losses that are closer to the test error. We first demonstrate improved performance over MLE on two different tasks: OCR and spelling correction. Then, we propose a subsampling strategy to enable SEARNN to scale to large vocabulary sizes. This allows us to validate the benefits of our approach on a machine translation task.",4 "structured transforms for small-footprint deep learning. We consider the task of building compact deep learning pipelines suitable for deployment on storage and power constrained mobile devices. We propose a unified framework to learn a broad family of structured parameter matrices that are characterized by the notion of low displacement rank. Our structured transforms admit fast function and gradient evaluation, and span a rich range of parameter sharing configurations whose statistical modeling capacity can be explicitly tuned along a continuum from structured to unstructured. Experimental results show that these transforms can significantly accelerate inference and forward/backward passes during training, and offer superior accuracy-compactness-speed tradeoffs in comparison to a number of existing techniques. In keyword spotting applications in mobile speech recognition, our methods are much more effective than standard linear low-rank bottleneck layers and nearly retain the performance of state of the art models, while providing more than 3.5-fold compression.",19 "building pattern recognition applications with the SPARE library. 
paper presents spare c++ library, open source software tool conceived build pattern recognition soft computing systems. library follows requirement generality: implemented algorithms able process user-defined input data types transparently, labeled graphs sequences objects, well standard numeric vectors. present high-level picture spare library characteristics, focusing instead specific practical possibility constructing pattern recognition systems different input data types. particular, proof concept, discuss two application instances involving clustering real-valued multidimensional sequences classification labeled graphs.",4 "interpreting outliers: localized logistic regression density ratio estimation. propose inlier-based outlier detection method capable identifying outliers explaining outliers, identifying outlier-specific features. specifically, employ inlier-based outlier detection criterion, uses ratio inlier test probability densities measure plausibility outlier. estimating density ratio function, propose localized logistic regression algorithm. thanks locality model, variable selection outlier-specific, help interpret points outliers high-dimensional space. synthetic experiments, show proposed algorithm successfully detect important features outliers. moreover, show proposed algorithm tends outperform existing algorithms benchmark datasets.",19 "synthesis recurrent neural networks dynamical system simulation. review several widely used techniques training recurrent neural networks approximate dynamical systems, describe novel algorithm task. algorithm based earlier theoretical result guarantees quality network approximation. show feedforward neural network trained vector field representation given dynamical system using backpropagation, recast, using matrix manipulations, recurrent network replicates original system's dynamics. detailing algorithm relation earlier approaches, present numerical examples demonstrate capabilities. 
one distinguishing features approach original dynamical systems recurrent networks simulate operate continuous time.",4 "vision-aided absolute trajectory estimation using unsupervised deep network online error correction. present unsupervised deep neural network approach fusion rgb-d imagery inertial measurements absolute trajectory estimation. network, dubbed visual-inertial-odometry learner (violearner), learns perform visual-inertial odometry (vio) without inertial measurement unit (imu) intrinsic parameters (corresponding gyroscope accelerometer bias white noise) extrinsic calibration imu camera. network learns integrate imu measurements generate hypothesis trajectories corrected online according jacobians scaled image projection errors respect spatial grid pixel coordinates. evaluate network state-of-the-art (soa) visual-inertial odometry, visual odometry, visual simultaneous localization mapping (vslam) approaches kitti odometry dataset demonstrate competitive odometry performance.",4 structured sparse modelling hierarchical gp. paper new bayesian model sparse linear regression spatio-temporal structure proposed. incorporates structural assumptions based hierarchical gaussian process prior spike slab coefficients. design inference algorithm based expectation propagation evaluate model real data.,19 "asynchronous partial overlay: new algorithm solving distributed constraint satisfaction problems. distributed constraint satisfaction (dcsp) long considered important problem multi-agent systems research. many real-world problems represented constraint satisfaction problems often present distributed form. article, present new complete, distributed algorithm called asynchronous partial overlay (apo) solving dcsps based cooperative mediation process. 
primary ideas behind algorithm agents, acting mediator, centralize small, relevant portions dcsp, centralized subproblems overlap, agents increase size subproblems along critical paths within dcsp problem solving unfolds. present empirical evidence shows apo outperforms known, complete dcsp techniques.",4 "new worst-case upper bound #xsat. algorithm running o(1.1995^n) presented counting models exact satisfiability formulae (#xsat). faster previously best algorithm runs o(1.2190^n). order improve efficiency algorithm, new principle, i.e. common literals principle, addressed simplify formulae. allows us eliminate common literals. addition, firstly inject resolution principles solving #xsat problem, therefore improves efficiency algorithm.",4 "ranked bandits metric spaces: learning optimally diverse rankings large document collections. learning rank research assumed utility different documents independent, results learned ranking functions return redundant results. approaches avoid rather unsatisfyingly lacked theoretical foundations, scale. present learning-to-rank formulation optimizes fraction satisfied users, several scalable algorithms explicitly takes document similarity ranking context account. formulation non-trivial common generalization two multi-armed bandit models literature: ""ranked bandits"" (radlinski et al., icml 2008) ""lipschitz bandits"" (kleinberg et al., stoc 2008). present theoretical justifications approach, well near-optimal algorithm. evaluation adds optimizations improve empirical performance, shows algorithms learn orders magnitude quickly previous approaches.",4 "automated thermal face recognition based minutiae extraction. paper efficient approach human face recognition based use minutiae points thermal face image proposed. thermogram human face captured thermal infra-red camera. image processing methods used pre-process captured thermogram, different physiological features based blood perfusion data extracted. 
blood perfusion data related distribution blood vessels face skin. present work, three different methods used get blood perfusion image, namely bit-plane slicing medial axis transform, morphological erosion medial axis transform, sobel edge operators. distribution blood vessels unique person set extracted minutiae points blood perfusion data human face unique face. two different methods discussed extracting minutiae points blood perfusion data. extraction features entire face image partitioned equal size blocks total number minutiae points block computed construct final feature vector. therefore, size feature vectors found total number blocks considered. five layer feed-forward back propagation neural network used classification tool. number experiments conducted evaluate performance proposed face recognition methodologies varying block size database created laboratory. found first method supersedes two producing accuracy 97.62% block size 16x16 bit-plane 4.",4 "tempeval-3: evaluating events, time expressions, temporal relations. describe tempeval-3 task currently preparation semeval-2013 evaluation exercise. aim tempeval advance research temporal information processing. tempeval-3 follows previous tempeval events, incorporating: three-part task structure covering event, temporal expression temporal relation extraction; larger dataset; single overall task quality scores.",4 "generating synthetic data text recognition. generating synthetic images art emulates natural process image generation closest possible manner. work, exploit framework data generation handwritten domain. render synthetic data using open source fonts incorporate data augmentation schemes. part work, release 9m synthetic handwritten word image corpus could useful training deep network architectures advancing performance handwritten word spotting recognition tasks.",4 "neural networks complex data. artificial neural networks simple efficient machine learning tools. 
defined originally traditional setting simple vector data, neural network models evolved address difficulties complex real world problems, ranging time evolving data sophisticated data structures graphs functions. paper summarizes advances themes last decade, focus results obtained members samm team université paris 1",4 "enumeration extractive oracle summaries. analyze limitations future directions extractive summarization paradigm, paper proposes integer linear programming (ilp) formulation obtain extractive oracle summaries terms rouge-n. also propose algorithm enumerates oracle summaries set reference summaries exploit f-measures evaluate system summaries contain many sentences extracted oracle summary. experimental results obtained document understanding conference (duc) corpora demonstrated following: (1) room still exists improve performance extractive summarization; (2) f-measures derived enumerated oracle summaries significantly stronger correlations human judgment derived single oracle summaries.",4 "scalable annotation fine-grained categories without experts. present crowdsourcing workflow collect image annotations visually similar synthetic categories without requiring experts. animals, direct link taxonomy visual similarity: e.g. collie (type dog) looks similar collies (e.g. smooth collie) greyhound (another type dog). however, synthetic categories cars, objects similar taxonomy different appearance: e.g. 2011 ford f-150 supercrew-hd looks 2011 ford f-150 supercrew-ll different 2011 ford f-150 supercrew-svt. introduce graph based crowdsourcing algorithm automatically group visually indistinguishable objects together. using workflow, label 712,430 images ~1,000 amazon mechanical turk workers; resulting largest fine-grained visual dataset reported date 2,657 categories cars annotated 1/20th cost hiring experts.",4 "bandits warm-up cold recommender systems. 
address cold start problem recommendation systems assuming contextual information available neither users, items. consider case access set ratings items users. existing works consider batch setting, use cross-validation tune parameters. classical method consists minimizing root mean square error training subset ratings provides factorization matrix ratings, interpreted latent representation items users. contribution paper 5-fold. first, explicit issues raised kind batch setting users items ratings. then, propose online setting closer actual use recommender systems; setting inspired bandit framework. proposed methodology used turn recommender system dataset (such netflix, movielens,...) sequential dataset. then, explicit strong insightful link contextual bandit algorithms matrix factorization; leads us new algorithm tackles exploration/exploitation dilemma associated cold start problem strikingly new perspective. finally, experimental evidence confirm algorithm effective dealing cold start problem publicly available datasets. overall, goal paper bridge gap recommender systems based matrix factorizations based contextual bandits.",4 "robust feature selection mutual information distributions. mutual information widely used artificial intelligence, descriptive way, measure stochastic dependence discrete random variables. order address questions reliability empirical value, one must consider sample-to-population inferential approaches. paper deals distribution mutual information, obtained bayesian framework second-order dirichlet prior distribution. exact analytical expression mean analytical approximation variance reported. asymptotic approximations distribution proposed. results applied problem selecting features incremental learning classification naive bayes classifier. fast, newly defined method shown outperform traditional approach based empirical mutual information number real data sets. 
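The empirical mutual information at the core of the feature-selection abstract above has a simple plug-in estimate. This sketch assumes discrete variables and a toy sample; it does not implement the paper's Dirichlet-prior posterior over the MI value:

```python
# Hedged sketch: plug-in (empirical) mutual information between two
# discrete variables, the quantity whose sampling distribution the
# abstract above studies under a second-order Dirichlet prior.
from collections import Counter
from math import log

def empirical_mi(xs, ys):
    """Plug-in estimate of I(X; Y) in nats from paired samples."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )

# Perfectly dependent variables: MI equals H(X) = log 2 nats.
xs = [0, 1, 0, 1]
print(empirical_mi(xs, xs))
```

For feature selection, features are typically ranked by their estimated MI with the class label; the paper's point is that the raw plug-in value above should be replaced by a distribution-aware (credibility) criterion.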
finally, theoretical development reported allows one efficiently extend methods incomplete samples easy effective way.",4 "health analytics: systematic review approaches detect phenotype cohorts using electronic health records. paper presents systematic review state-of-the-art approaches identify patient cohorts using electronic health records. gives comprehensive overview commonly detected phenotypes underlying data sets. special attention given preprocessing input data different modeling approaches. literature review confirms natural language processing promising approach electronic phenotyping. however, accessibility lack natural language processing standards medical texts remain challenge. future research develop standards investigate machine learning approaches best suited type medical data.",19 "neuro fuzzy systems: state-of-the-art modeling techniques. fusion artificial neural networks (ann) fuzzy inference systems (fis) attracted growing interest researchers various scientific engineering areas due growing need adaptive intelligent systems solve real world problems. ann learns scratch adjusting interconnections layers. fis popular computing framework based concept fuzzy set theory, fuzzy if-then rules, fuzzy reasoning. advantages combination ann fis obvious. several approaches integrate ann fis often depends application. broadly classify integration ann fis three categories namely concurrent model, cooperative model fully fused model. paper starts discussion features model generalize advantages deficiencies model. focus review different types fused neuro-fuzzy systems citing advantages disadvantages model.",4 current state challenges automatic planning web service composition. paper gives survey current state web service compositions difficulties solutions automated web service compositions. first gives definition web service composition motivation goal it. explores need automated web service compositions formally defines domains. 
techniques solutions proposed papers surveyed solve current difficulty automated web service composition. verification future work discussed end extend topic.,4 "approximated computation belief functions robust design optimization. paper presents ideas reduce computational cost evidence-based robust design optimization. evidence theory crystallizes aleatory epistemic uncertainties design parameters, providing two quantitative measures, belief plausibility, credibility computed value design budgets. paper proposes techniques compute approximation belief plausibility cost fraction one required accurate calculation two values. simple test cases show proposed techniques scale dimension problem. finally simple example spacecraft system design presented.",4 "discussion: latent variable graphical model selection via convex optimization. discussion ""latent variable graphical model selection via convex optimization"" venkat chandrasekaran, pablo a. parrilo alan s. willsky [arxiv:1008.1290].",12 "riemannian stochastic quasi-newton algorithm variance reduction convergence analysis. stochastic variance reduction algorithms recently become popular minimizing average large, finite number loss functions. present paper proposes riemannian stochastic quasi-newton algorithm variance reduction (r-sqn-vr). key challenges averaging, adding, subtracting multiple gradients addressed notions retraction vector transport. present convergence analyses r-sqn-vr non-convex retraction-convex functions retraction vector transport operators. proposed algorithm evaluated karcher mean computation symmetric positive-definite manifold low-rank matrix completion grassmann manifold. cases, proposed algorithm outperforms state-of-the-art riemannian batch stochastic gradient algorithms.",4 "targeted advertising based browsing history. audience interest, demography, purchase behavior possible classifications extremely important factors carefully studied targeting campaign. 
information help advertisers publishers deliver advertisements right audience group. however, easy collect information, especially online audience limited interaction minimum deterministic knowledge. paper, propose predictive framework estimate online audience demographic attributes based browsing histories. proposed framework, first, retrieve content websites visited audience, represent content website feature vectors; second, aggregate vectors websites audience visited arrive feature vectors representing users; finally, support vector machine exploited predict audience demographic attributes. key achieving good prediction performance preparing representative features audience. word embedding, widely used technique natural language processing tasks, together term frequency-inverse document frequency weighting scheme used proposed method. new representation approach unsupervised easy implement. experimental results demonstrate new audience feature representation method powerful existing baseline methods, leading great improvement prediction accuracy.",4 "fast algorithm datalog inexpressibility temporal reasoning. introduce new tractable temporal constraint language, strictly contains ord-horn language buerkert nebel class and/or precedence constraints. algorithm present language decides whether given set constraints consistent time quadratic input size. also prove (unlike ord-horn) language cannot solved datalog establishing local consistency.",4 "robust head pose estimation using contourlet transform. estimating pose head important preprocessing step many pattern recognition computer vision systems face recognition. since performance face recognition systems greatly affected poses face, estimate accurate pose face human face image still challenging problem. paper, represent novel method head pose estimation. enhance efficiency estimation use contourlet transform feature extraction. contourlet transform multi-resolution, multi-direction transform. 
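The term frequency-inverse document frequency weighting named in the targeted-advertising abstract above can be sketched as follows; the tiny corpus and the length-normalized tf variant are invented for illustration:

```python
# Illustrative sketch (not the paper's code) of tf-idf weighting:
# tf(t, d) = count(t in d) / len(d), idf(t) = log(N / df(t)).
from math import log

def tfidf(corpus):
    """corpus: list of token lists -> list of {term: weight} dicts."""
    n = len(corpus)
    df = {}  # document frequency of each term
    for doc in corpus:
        for term in set(doc):
            df[term] = df.get(term, 0) + 1
    vectors = []
    for doc in corpus:
        weights = {}
        for term in doc:
            tf = doc.count(term) / len(doc)
            weights[term] = tf * log(n / df[term])
        vectors.append(weights)
    return vectors

# Invented toy browsing-content corpus, one token list per website.
docs = [["shop", "sale", "shoes"], ["news", "sale"], ["news", "sports"]]
vecs = tfidf(docs)
```

Terms occurring in every document get an idf of log(1) = 0, which is exactly the damping of uninformative common terms that the scheme is designed to produce.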
order reduce feature space dimension obtain appropriate features use lda (linear discriminant analysis) pca (principal component analysis) remove inefficient features. then, apply different classifiers k-nearest neighborhood (knn) minimum distance. use publicly available feret database evaluate performance proposed method. simulation results indicate superior robustness proposed method.",4 "inference probabilistic graphical models graph neural networks. useful computation acting complex environment infer marginal probabilities probable states task-relevant variables. probabilistic graphical models efficiently represent structure complex data, performing inferences generally difficult. message-passing algorithms, belief propagation, natural way disseminate evidence amongst correlated variables exploiting graph structure, algorithms struggle conditional dependency graphs contain loops. use graph neural networks (gnns) learn message-passing algorithm solves inference tasks. first show architecture gnns well-matched inference tasks. demonstrate efficacy inference approach training gnns ensemble graphical models showing substantially outperform belief propagation loopy graphs. message-passing algorithms generalize training set larger graphs graphs different structure.",4 "human-level cmr image analysis deep fully convolutional networks. cardiovascular magnetic resonance (cmr) imaging standard imaging modality assessing cardiovascular diseases (cvds), leading cause death globally. cmr enables accurate quantification cardiac chamber volume, ejection fraction myocardial mass, providing wealth information sensitive specific diagnosis monitoring cvds. however, years, clinicians relying manual approaches cmr image analysis, time consuming prone subjective errors. major clinical challenge automatically derive quantitative clinically relevant information cmr images. deep neural networks shown great potential image pattern recognition segmentation variety tasks. 
demonstrate automated analysis method cmr images, based fully convolutional network (fcn). network trained evaluated dataset unprecedented size, consisting 4,875 subjects 93,500 pixelwise annotated images, far largest annotated cmr dataset. combining fcn large-scale annotated dataset, show first time automated method achieves performance par human experts analysing cmr images deriving clinical measures. anticipate starting point automated comprehensive cmr analysis human-level performance, facilitated machine learning. important advance pathway towards computer-assisted cvd assessment.",4 "predicting citywide crowd flows using deep spatio-temporal residual networks. forecasting flow crowds great importance traffic management public safety, challenging affected many complex factors, including spatial dependencies (nearby distant), temporal dependencies (closeness, period, trend), external conditions (e.g., weather events). propose deep-learning-based approach, called st-resnet, collectively forecast two types crowd flows (i.e. inflow outflow) every region city. design end-to-end structure st-resnet based unique properties spatio-temporal data. specifically, employ residual neural network framework model temporal closeness, period, trend properties crowd traffic. property, design branch residual convolutional units, models spatial properties crowd traffic. st-resnet learns dynamically aggregate output three residual neural networks based data, assigning different weights different branches regions. aggregation combined external factors, weather day week, predict final traffic crowds every region. developed real-time system based microsoft azure cloud, called urbanflow, providing crowd flow monitoring forecasting guiyang city china. addition, present extensive experimental evaluation using two types crowd flows beijing new york city (nyc), st-resnet outperforms nine well-known baselines.",4 "human gender classification: review. 
gender contains wide range information regarding characteristics difference male female. successful gender recognition essential critical many applications commercial domains applications human-computer interaction computer-aided physiological psychological analysis. proposed various approaches automatic gender classification using features derived human bodies and/or behaviors. first, paper introduces challenge application gender classification research. then, development framework gender classification described. besides, compare state-of-the-art approaches, including vision-based methods, biological information-based method, social network information-based method, provide comprehensive review area gender classification. meantime, highlight strength discuss limitation method. finally, review also discusses several promising applications future work.",4 "clinical information extraction via convolutional neural network. report implementation clinical information extraction tool leverages deep neural network annotate event spans attributes raw clinical notes pathology reports. approach uses context words part-of-speech tags shape information features. hire temporal (1d) convolutional neural network learn hidden feature representations. finally, use multilayer perceptron (mlp) predict event spans. empirical evaluation demonstrates approach significantly outperforms baselines.",4 "classification fused images using radial basis function neural network human face recognition. efficient fusion technique automatic face recognition presented. fusion visual thermal images done take advantages thermal images well visual images. employing fusion new image obtained, provides detailed, reliable, discriminating information. method fused images generated using visual thermal face images first step. second step, fused images projected eigenspace finally classified using radial basis function neural network. 
experiments object tracking classification beyond visible spectrum (otcbvs) database benchmark thermal visual face images used. experimental results show proposed approach performs well recognizing unknown individuals maximum success rate 96%.",4 "cascaded segmentation-detection networks word-level text spotting. introduce algorithm word-level text spotting able accurately reliably determine bounding regions individual words text ""in wild"". system formed cascade two convolutional neural networks. first network fully convolutional charge detecting areas containing text. results reliable possibly inaccurate segmentation input image. second network (inspired popular yolo architecture) analyzes segment produced first stage, predicts oriented rectangular regions containing individual words. post-processing (e.g. text line grouping) necessary. execution time 450 ms 1000-by-560 image titan x gpu, system achieves highest score date among published algorithms icdar 2015 incidental scene text dataset benchmark.",4 "e-qraq: multi-turn reasoning dataset simulator explanations. paper present new dataset user simulator e-qraq (explainable query, reason, answer question) tests agent's ability read ambiguous text; ask questions answer challenge question; explain reasoning behind questions answer. user simulator provides agent short, ambiguous story challenge question story. story ambiguous entities replaced variables. turn agent may ask value variable try answer challenge question. response user simulator provides natural language explanation agent's query answer useful narrowing set possible answers, not. demonstrate one potential application e-qraq dataset, train new neural architecture based end-to-end memory networks successfully generate predictions partial explanations current understanding problem. observe strong correlation quality prediction explanation.",4 "two hilbert schemes computer vision. study multiview moduli problems arise computer vision. 
show moduli spaces always smooth irreducible, calibrated uncalibrated cases, number views. also show moduli spaces always embed (diagram) hilbert schemes, embeddings open immersions four views. approach also yields natural smooth cover classical variety essential matrices seems appeared literature date.",12 "stfcn: spatio-temporal fcn semantic video segmentation. paper presents novel method involve spatial temporal features semantic video segmentation. current work convolutional neural networks(cnns) shown cnns provide advanced spatial features supporting good performance solutions image video analysis, especially semantic segmentation task. investigate involving temporal features also good effect segmenting video data. propose module based long short-term memory (lstm) architecture recurrent neural network interpreting temporal characteristics video frames time. system takes input frames video produces correspondingly-sized output; segmenting video method combines use three components: first, regional spatial features frames extracted using cnn; then, using lstm temporal features added; finally, deconvolving spatio-temporal features produce pixel-wise predictions. key insight build spatio-temporal convolutional networks (spatio-temporal cnns) end-to-end architecture semantic video segmentation. adapted fully known convolutional network architectures (such fcn-alexnet fcn-vgg16), dilated convolution spatio-temporal cnns. spatio-temporal cnns achieve state-of-the-art semantic segmentation, demonstrated camvid nyudv2 datasets.",4 study similarity generator discriminator gan architecture. one popular generative model high-quality results generative adversarial networks(gan). type architecture consists two separate networks play other. generator creates output input noise given it. discriminator task determining input real fake. takes place constantly eventually leads generator modeling target distribution. 
paper includes study actual weights learned network study similarity discriminator generator networks. paper also tries leverage similarity networks shows indeed networks may similar structure experimental evidence novel shared architecture.,4 "random finite set model data clustering. goal data clustering partition data points groups minimize given objective function. existing clustering algorithms treat data point vector, many applications datum vector point pattern set points. moreover, many existing clustering methods require user specify number clusters, available advance. paper proposes new class models data clustering addresses set-valued data well unknown number clusters, using dirichlet process mixture poisson random finite sets. also develop efficient markov chain monte carlo posterior inference technique learn number clusters mixture parameters automatically data. numerical studies presented demonstrate salient features new model, particular capacity discover extremely unbalanced clusters data.",19 "fixed-point algorithms learning determinantal point processes. determinantal point processes (dpps) offer elegant tool encoding probabilities subsets ground set. discrete dpps parametrized positive semidefinite matrix (called dpp kernel), estimating kernel key learning dpps observed data. consider task learning dpp kernel, develop surprisingly simple yet effective new algorithm. algorithm offers following benefits previous approaches: (a) much simpler; (b) yields equally good sometimes even better local maxima; (c) runs order magnitude faster large problems. present experimental results real simulated data illustrate numerical performance technique.",4 "easy-setup eye movement recording system human-computer interaction. tracking movement human eyes expected yield natural convenient applications based human-computer interaction (hci). 
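The DPP abstract above centers on a kernel L whose principal minors define subset probabilities, P(Y) = det(L_Y) / det(L + I). A toy sketch with an invented 2-item kernel:

```python
# Toy sketch of the quantity a DPP kernel parametrizes. The pure-Python
# cofactor determinant is only suitable for tiny matrices, and the
# kernel values below are invented for illustration.

def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 0:
        return 1.0
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )

def dpp_prob(L, subset):
    """P(Y = subset) = det(L_Y) / det(L + I) for a discrete DPP."""
    ground = range(len(L))
    L_Y = [[L[i][j] for j in subset] for i in subset]
    L_I = [[L[i][j] + (1.0 if i == j else 0.0) for j in ground] for i in ground]
    return det(L_Y) / det(L_I)

# 2-item ground set; a large off-diagonal similarity makes the two
# items unlikely to co-occur in a sampled subset.
L = [[1.0, 0.9], [0.9, 1.0]]
```

Because off-diagonal entries encode similarity, det(L_Y) shrinks for subsets of similar items, which is how a DPP promotes diversity; learning the kernel from observed subsets is the estimation task the abstract addresses.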
implement effective eye-tracking system, eye movements must recorded without placing restriction user's behavior user discomfort. paper describes eye movement recording system offers free-head, simple configuration. require user wear anything head, move head freely. instead using computer, system uses visual digital signal processor (dsp) camera detect position eye corner, center pupil calculate eye movement. evaluation tests show sampling rate system 300 hz accuracy 1.8 degree/s.",4 "dlpaper2code: auto-generation code deep learning research papers. abundance research papers deep learning, reproducibility adoption existing works becomes challenge. due lack open source implementations provided authors. further, re-implementing research papers different library daunting task. address challenges, propose novel extensible approach, dlpaper2code, extract understand deep learning design flow diagrams tables available research paper convert abstract computational graph. extracted computational graph converted execution ready source code keras caffe, real-time. arxiv-like website created automatically generated designs made publicly available 5,000 research papers. generated designs could rated edited using intuitive drag-and-drop ui framework crowdsourced manner. evaluate approach, create simulated dataset 216,000 valid design visualizations using manually defined grammar. experiments simulated dataset show proposed framework provide $93\%$ accuracy flow diagram content extraction.",4 "automated detection individual micro-calcifications mammograms using multi-stage cascade approach. mammography, efficacy computer-aided detection methods depends, part, robust localisation micro-calcifications ($\mu$c). currently, effective methods based three steps: 1) detection individual $\mu$c candidates, 2) clustering individual $\mu$c candidates, 3) classification $\mu$c clusters. 
second step motivated reduce number false positive detections first step evidence malignancy depends relatively large number $\mu$c detections within certain area. paper, propose novel approach $\mu$c detection, consisting detection and classification individual $\mu$c candidates, using shape appearance features, using cascade boosting classifiers. final step approach clusters remaining individual $\mu$c candidates. main advantage approach lies ability reject significant number false positive $\mu$c candidates compared previously proposed methods. specifically, inbreast dataset, show approach true positive rate (tpr) individual $\mu$cs 40% one false positive per image (fpi) tpr 80% 10 fpi. results significantly accurate current state art, tpr less 1% one fpi tpr 10% 10 fpi. results competitive state art subsequent stage detecting clusters $\mu$cs.",4 "joint conditional estimation tagging parsing models. paper compares two different ways estimating statistical language models. many statistical nlp tagging parsing models estimated maximizing (joint) likelihood fully-observed training data. however, since applications require conditional probability distributions, distributions principle learnt maximizing conditional likelihood training data. perhaps somewhat surprisingly, models estimated maximizing joint superior models estimated maximizing conditional, even though latter models intuitively access ""more information"".",4 "regression respect sensing actions partial states. paper, present state-based regression function planning domains agent complete information may sensing actions. consider binary domains employ 0-approximation [son & baral 2001] define regression function. binary domains, use 0-approximation means using 3-valued states. although planning using approach incomplete respect full semantics, adopt lower complexity. prove soundness completeness regression formulation respect definition progression. 
Specifically, we show that (i) a plan obtained through regression for a planning problem is indeed a progression solution of that planning problem, and (ii) for any plan found by progression, using regression one obtains a plan that is equivalent to it. We then develop a conditional planner that utilizes our regression function. We prove the soundness and completeness of our planning algorithm and present experimental results with respect to several well-known planning problems in the literature.",4 "Online learning for time series prediction. In this paper we address the problem of predicting a time series using the ARMA (autoregressive moving average) model, under minimal assumptions on the noise terms. Using regret minimization techniques, we develop effective online learning algorithms for the prediction problem, without assuming that the noise terms are Gaussian, identically distributed or even independent. Furthermore, we show that our algorithm's performance asymptotically approaches the performance of the best ARMA model in hindsight.",4 "An enhanced multiobjective evolutionary algorithm based on decomposition for solving the unit commitment problem. The unit commitment (UC) problem is a nonlinear, high-dimensional, highly constrained, mixed-integer power system optimization problem that is generally solved in the literature by considering the minimization of the system operation cost as the only objective. However, due to increasing environmental concerns, recent attention has shifted to incorporating emission in the problem formulation. In this paper, a multi-objective evolutionary algorithm based on decomposition (MOEA/D) is proposed to solve the UC problem as a multi-objective optimization problem, considering the minimization of both cost and emission as the multiple objectives. Since the UC problem is a mixed-integer optimization problem consisting of binary UC variables and continuous power dispatch variables, a novel hybridization strategy is proposed within the framework of MOEA/D in which a genetic algorithm (GA) evolves the binary variables while differential evolution (DE) evolves the continuous variables. Further, a novel non-uniform weight vector distribution strategy is proposed, and a parallel island model based on a combination of MOEA/D with uniform and non-uniform weight vector distribution strategies is implemented to enhance the performance of the presented algorithm.
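The GA-for-binary / DE-for-continuous hybridization described above can be sketched in miniature. These are generic textbook operators (bit-flip mutation and a DE/rand/1 mutant vector), not the paper's exact variation scheme; the function names and parameter values are illustrative:

```python
import random

# Sketch of the mixed-integer split: binary unit on/off variables evolve by
# GA-style bit-flip mutation, continuous dispatch variables by a DE/rand/1
# mutant vector v = a + F * (b - c).

def ga_bitflip(bits, p=0.1, rng=random):
    """Flip each binary commitment variable with probability p."""
    return [1 - b if rng.random() < p else b for b in bits]

def de_mutant(a, b, c, F=0.5):
    """DE/rand/1 mutant from three parent dispatch vectors."""
    return [ai + F * (bi - ci) for ai, bi, ci in zip(a, b, c)]

random.seed(0)
child_bits = ga_bitflip([1, 0, 1, 1, 0])                       # unit on/off schedule
child_dispatch = de_mutant([100.0, 200.0], [110.0, 190.0], [90.0, 210.0])
```

In a full MOEA/D loop, each subproblem would recombine both parts and repair constraint violations before evaluation.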
Extensive case studies are presented on different test systems, and the effectiveness of the proposed hybridization strategy, the non-uniform weight vector distribution strategy, and the parallel island model is verified through stringent simulated results. Further, exhaustive benchmarking against algorithms proposed in the literature is presented to demonstrate the superiority of the proposed algorithm in obtaining significantly better converged and uniformly distributed trade-off solutions.",4 "Adversarially regularized graph autoencoder. Graph embedding is an effective method to represent graph data in a low-dimensional space for graph analytics. Existing embedding algorithms typically focus on preserving the topological structure or minimizing the reconstruction errors of graph data, but they have mostly ignored the data distribution of the latent codes from the graphs, which often results in inferior embeddings for real-world graph data. In this paper, we propose a novel adversarial graph embedding framework for graph data. The framework encodes the topological structure and node content of a graph into a compact representation, on which a decoder is trained to reconstruct the graph structure. Furthermore, the latent representation is enforced to match a prior distribution via an adversarial training scheme. To learn a robust embedding, two variants of adversarial approaches, the adversarially regularized graph autoencoder (ARGA) and the adversarially regularized variational graph autoencoder (ARVGA), are developed. Experimental studies on real-world graphs validate our design and demonstrate that our algorithms outperform baselines by a wide margin in link prediction, graph clustering, and graph visualization tasks.",4 "Posterior mean super-resolution with a compound Gaussian Markov random field prior. This manuscript proposes a posterior mean (PM) super-resolution (SR) method with a compound Gaussian Markov random field (MRF) prior. SR is a technique to estimate a spatially high-resolution image from multiple observed low-resolution images. The compound Gaussian MRF model provides a preferable prior for natural images that preserves edges. The PM is the optimal estimator for the objective function of peak signal-to-noise ratio (PSNR). This estimator is numerically determined using variational Bayes (VB).
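As a concrete reference for the PSNR objective mentioned above (for which the posterior mean is the optimal estimator, since maximizing PSNR is equivalent to minimizing mean squared error), here is the standard PSNR computation on illustrative 8-bit pixel values; the function name and data are hypothetical:

```python
import math

# Standard peak signal-to-noise ratio: PSNR = 10 * log10(peak^2 / MSE).

def psnr(reference, estimate, peak=255.0):
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    return 10 * math.log10(peak ** 2 / mse)

score = psnr([50, 100, 150, 200], [52, 98, 149, 201])   # MSE = 2.5
```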
We solve the conjugate prior problem of VB and the exponential-order calculation cost problem of the compound Gaussian MRF prior with simple Taylor approximations. In experiments, the proposed method roughly overcomes existing methods.",4 "Neural machine translation of rare words with subword units. Neural machine translation (NMT) models typically operate with a fixed vocabulary, but translation is an open-vocabulary problem. Previous work addresses the translation of out-of-vocabulary words by backing off to a dictionary. In this paper, we introduce a simpler and more effective approach, making the NMT model capable of open-vocabulary translation by encoding rare and unknown words as sequences of subword units. This is based on the intuition that various word classes are translatable via smaller units than words, for instance names (via character copying or transliteration), compounds (via compositional translation), and cognates and loanwords (via phonological and morphological transformations). We discuss the suitability of different word segmentation techniques, including simple character n-gram models and a segmentation based on the byte pair encoding compression algorithm, and empirically show that subword models improve over a back-off dictionary baseline for the WMT 15 translation tasks English-German and English-Russian by 1.1 and 1.3 BLEU, respectively.",4 "Dynamic jointrees. It is well known that one can ignore parts of a belief network when computing answers to certain probabilistic queries. It is also well known that the ignorable parts (if any) depend on the specific query of interest and, therefore, may change as the query changes. Algorithms based on jointrees, however, do not seem to take computational advantage of these facts, given that they typically construct jointrees for worst-case queries; that is, queries for which every part of the belief network is considered relevant. To address this limitation, we propose in this paper a method for reconfiguring jointrees dynamically as the query changes. The reconfiguration process aims at maintaining a jointree that corresponds to the underlying belief network pruned given the current query. Our reconfiguration method is marked by three characteristics: (a) it is based on a non-classical definition of jointrees; (b) it is relatively efficient; (c) it can reuse computations performed before a jointree is reconfigured.
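The byte pair encoding segmentation named in the subword-units abstract above admits a compact sketch: count adjacent symbol pairs across the vocabulary and merge the most frequent pair. This is a minimal single-merge illustration, not the paper's full training procedure (which repeats merges and handles word-end markers); the toy vocabulary is hypothetical:

```python
from collections import Counter

# One byte-pair-encoding merge step: find the most frequent adjacent
# symbol pair, then fuse it wherever it occurs.

def most_frequent_pair(vocab):
    """vocab: {tuple_of_symbols: frequency}."""
    pairs = Counter()
    for symbols, freq in vocab.items():
        for pair in zip(symbols, symbols[1:]):
            pairs[pair] += freq
    return max(pairs, key=pairs.get)

def merge_pair(vocab, pair):
    merged = {}
    for symbols, freq in vocab.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = freq
    return merged

vocab = {("l", "o", "w"): 5, ("l", "o", "w", "e", "r"): 2}
pair = most_frequent_pair(vocab)
vocab = merge_pair(vocab, pair)
```

Repeating this loop a fixed number of times yields the subword vocabulary; rare words then decompose into previously learned merges.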
We present preliminary experimental results demonstrating significant savings over using static jointrees when queries change considerably.",4 "The role of contrast and regularity in perceptual boundary saliency. Mathematical morphology proposes to extract shapes from images as connected components of level sets. These methods prove suitable for shape recognition and analysis. We present a method to select the perceptually significant (i.e., contrasted) level lines (boundaries of level sets), using the Helmholtz principle first proposed by Desolneux et al. Contrarily to the classical formulation by Desolneux et al., where level lines must be entirely salient, the proposed method allows the detection of partially salient level lines, thus resulting in more robust and more stable detections. We tackle the problem of combining two gestalts as a measure of saliency and propose a method that reinforces detections. Results on natural images show the good performance of the proposed methods.",4 "Considering users' behaviours in improving the responses of an information base. In this paper, we aim to propose a model that helps the efficient use of an information system (IS) by users, within the organization represented by the IS, in order to resolve decisional problems. In other words, we want to aid a user within an organization in obtaining information that corresponds to his needs (informational needs that result from decisional problems). The type of information system we refer to is an economic intelligence system, as it supports the economic intelligence processes of an organisation. Our assumption is that every EI process begins with the identification of a decisional problem, which is translated into an informational need. This need is then translated into one or many information search problems (ISP). We also assume that an ISP is expressed in terms of the user's expectations, and that these expectations determine the activities and behaviors of the user as he/she uses the IS. The model we are proposing is used for the conception of a process of retrieving solution(s) or responses given by the system to an ISP, based on behaviours that correspond to the needs of the user.",4 "Open problems in Optimal AdaBoost and decision stumps. The significance of the study of the theoretical and practical properties of AdaBoost is unquestionable, given its simplicity, wide practical use, and effectiveness on real-world datasets.
Here we present open problems regarding the behavior of ""Optimal AdaBoost,"" a term coined by Rudin, Daubechies, and Schapire in 2004 to label the simple version of the standard AdaBoost algorithm in which the weak learner always outputs the weak classifier with the lowest weighted error among the hypothesis class of weak classifiers implicit in the weak learner. We concentrate on the standard, ""vanilla"" version of Optimal AdaBoost for binary classification that results from using the exponential-loss upper bound on the misclassification training error. We present two types of open problems. One deals with general weak hypotheses. The other deals with the particular case of decision stumps, as often and commonly used in practice. Answers to these open problems can have immediate and significant impact on (1) cementing previously established results on the asymptotic convergence properties of Optimal AdaBoost for finite datasets, which in turn can be the start of a convergence-rate analysis; (2) understanding how the class of effective decision stumps generated from data, which is empirically observed to be significantly smaller than the class typically obtained, affects the weak learner's running time and previously established improved bounds on the generalization performance of Optimal AdaBoost classifiers; and (3) shedding light on the ""self control"" that AdaBoost tends to exhibit in practice.",4 "Segmentation similarity and agreement. We propose a new segmentation evaluation metric, called segmentation similarity (S), that quantifies the similarity between two segmentations as the proportion of boundaries that are not transformed when comparing them using edit distance, essentially using edit distance as a penalty function and scaling penalties by segmentation size. We also propose several adapted inter-annotator agreement coefficients whose use is suitable for segmentation. We show that S is configurable enough to suit a wide variety of segmentation evaluations, and is an improvement upon the state of the art. We also propose using inter-annotator agreement coefficients to evaluate automatic segmenters in terms of human performance.",4 "Bayesian inference based on stationary Fokker-Planck sampling. A novel formalism for Bayesian learning in the context of complex inference models is proposed. The method is based on the use of the stationary Fokker--Planck (SFP) approach to sample the posterior density.
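The boundary-edit-distance idea in the segmentation-similarity abstract above can be sketched loosely. This is not the full S metric (which also discounts near-miss boundary transpositions); here only full insertions/deletions of boundaries are counted, scaled by the number of potential boundary sites, and the function name is hypothetical:

```python
# Loose sketch of boundary-edit-distance similarity between two
# segmentations of the same sequence of `length` atomic units.

def boundary_similarity(seg_a, seg_b, length):
    """seg_a, seg_b: sets of boundary positions (1..length-1)."""
    errors = len(seg_a ^ seg_b)      # boundaries present in only one segmentation
    potential = length - 1           # possible boundary sites
    return 1 - errors / potential

# two annotators of an 11-unit text disagree on one boundary position
s = boundary_similarity({3, 7}, {3, 8}, length=11)
```

Scaling by potential boundary sites is what keeps the metric comparable across texts of different lengths.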
Stationary Fokker--Planck sampling generalizes the Gibbs sampler algorithm to arbitrary and unknown conditional densities. In the SFP procedure, approximate analytical expressions for the conditionals and marginals of the posterior are constructed. At each stage of SFP, the approximate conditionals are used to define a Gibbs sampling process that is convergent to the full joint posterior. From the analytical marginals, efficient learning methods in the context of artificial neural networks are outlined. Off--line and incremental Bayesian inference and maximum likelihood estimation from the posterior are performed on classification and regression examples. A comparison of SFP with other Monte Carlo strategies for the general problem of sampling arbitrary densities is also presented. It is shown that SFP is able to jump large low--probability regions without the need for careful tuning of any step-size parameter. In fact, the SFP method requires only a small set of meaningful parameters that can be selected following clear, problem--independent guidelines. The computation cost of SFP, measured in terms of loss function evaluations, grows linearly with the given model's dimension.",3 "Constraint satisfaction with generalized staircase constraints. One of the key research interests in the area of constraint satisfaction problems (CSP) is to identify tractable classes of constraints and develop efficient solutions for them. In this paper, we introduce generalized staircase (GS) constraints, an important generalization of one tractable class found in the literature, namely, staircase constraints. GS constraints are of two kinds, down staircase (DS) and up staircase (US). We first examine several properties of GS constraints, and show that arc consistency is sufficient to determine a solution to a CSP of DS constraints. Further, we propose an optimal O(cd) time and space algorithm to compute arc consistency for GS constraints, where c is the number of constraints and d is the size of the largest domain. Next, observing that arc consistency is not necessary for solving a DSCSP, we propose a more efficient algorithm for solving it. With regard to US constraints, arc consistency is not known to be sufficient to determine a solution, and therefore, methods such as path consistency or variable elimination are required.
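Arc consistency, the central notion above, can be illustrated with the generic textbook AC-3 procedure; note this is not the paper's optimal O(cd) algorithm for GS constraints, and the variable names and toy constraints are illustrative:

```python
from collections import deque

# Generic AC-3: repeatedly prune values of x that have no supporting
# value in a neighbor y, re-enqueuing arcs whose support may be lost.

def ac3(domains, constraints):
    """domains: {var: set of values}; constraints: {(x, y): relation(vx, vy) -> bool}."""
    queue = deque(constraints)
    while queue:
        x, y = queue.popleft()
        rel = constraints[(x, y)]
        supported = {vx for vx in domains[x]
                     if any(rel(vx, vy) for vy in domains[y])}
        if supported != domains[x]:
            domains[x] = supported
            queue.extend(arc for arc in constraints if arc[1] == x)
    return domains

doms = ac3({"x": {1, 2, 3}, "y": {2, 3}},
           {("x", "y"): lambda a, b: a < b, ("y", "x"): lambda a, b: a > b})
```

For DS constraints, per the abstract, reaching arc consistency already suffices to read off a solution; for US constraints it does not.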
Since arc consistency acts as a subroutine in these existing methods, replacing it with our optimal O(cd) arc consistency algorithm produces a more efficient method for solving a USCSP.",4 "Poisson inverse problems by the plug-and-play scheme. The Anscombe transform offers an approximate conversion of a Poisson random variable into a Gaussian one. This transform is important and appealing, as it is easy to compute, and becomes handy in various inverse problems with Poisson noise contamination. The solution to such problems can be obtained by first applying the Anscombe transform, then applying a Gaussian-noise-oriented restoration algorithm of choice, and finally applying the inverse Anscombe transform. The appeal of this approach is due to the abundance of high-performance restoration algorithms designed for white additive Gaussian noise (we refer to these hereafter as ""Gaussian-solvers""). This process is known to work well for high SNR images, where the Anscombe transform provides a rather accurate approximation. When the noise level is high, the above path loses much of its effectiveness, and the common practice is to replace it with a direct treatment of the Poisson distribution. Naturally, with this we lose the ability to leverage the vastly available Gaussian-solvers. In this work we suggest a novel method for coupling Gaussian denoising algorithms to Poisson noisy inverse problems, based on a general approach termed ""Plug-and-Play"". Deploying the Plug-and-Play approach to such problems leads to an iterative scheme that repeats several key steps: 1) a convex programming task of simple form that can be easily treated; 2) a powerful Gaussian denoising algorithm of choice; and 3) a simple update step. Such a modular method, just like the Anscombe transform, enables developers to plug Gaussian denoising algorithms into the scheme in an easy way. The proposed method bears some similarity to the Anscombe operation, but in fact it relies on a different mathematical basis, which holds true for all SNR ranges.",4 "Lexical representation explains cortical entrainment during speech comprehension. Results from a recent neuroimaging study on spoken sentence comprehension have been interpreted as evidence for cortical entrainment to hierarchical syntactic structure.
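The Anscombe transform discussed in the plug-and-play abstract above is indeed easy to compute: it maps a Poisson variable x to approximately unit-variance Gaussian via 2*sqrt(x + 3/8). The sketch below uses the simple algebraic inverse (refined unbiased inverses exist); the function names are illustrative:

```python
import math

# Anscombe variance-stabilizing transform for Poisson data, and its
# plain algebraic inverse.

def anscombe(x):
    return 2.0 * math.sqrt(x + 0.375)

def inverse_anscombe(y):
    return (y / 2.0) ** 2 - 0.375

# round-trip a pixel value through the transform pair
v = inverse_anscombe(anscombe(16.0))
```

The high-SNR caveat in the abstract reflects that the Gaussian approximation degrades for small Poisson counts, which is precisely the regime the plug-and-play scheme targets.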
We present a simple computational model that predicts the power spectra from that study, even though the model's linguistic knowledge is restricted to the lexical level, and word-level representations are not combined into higher-level units (phrases or sentences). Hence, the cortical entrainment results can also be explained from the lexical properties of the stimuli, without recourse to hierarchical syntax.",16 "Inverse reinforcement learning with nonparametric behavior clustering. Inverse reinforcement learning (IRL) is the task of learning a single reward function given a Markov decision process (MDP) without a defined reward function, and a set of demonstrations generated by humans/experts. However, in practice, it may be unreasonable to assume that human behaviors can be explained by one reward function, since they may be inherently inconsistent. Also, demonstrations may be collected from various users and aggregated to infer and predict users' behaviors. In this paper, we introduce a non-parametric behavior clustering IRL algorithm to simultaneously cluster demonstrations and learn multiple reward functions from demonstrations that may be generated from more than one behavior. Our method is iterative: it alternates between clustering demonstrations into different behavior clusters and inverse learning of the reward functions until convergence. It is built upon an expectation-maximization formulation with non-parametric clustering in the IRL setting. Further, to improve computational efficiency, we remove the need to completely solve multiple IRL problems for multiple clusters during the iteration steps, and introduce a resampling technique to avoid generating too many unlikely clusters. We demonstrate the convergence and efficiency of the proposed method by learning multiple driver behaviors from demonstrations generated in a grid-world environment and from continuous trajectories collected from autonomous robot cars using the Gazebo robot simulator.",4
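The alternation described in the clustering-IRL abstract (assign demonstrations to behavior clusters, then refit each cluster's model) has the shape of an EM loop. This schematic replaces reward learning with fitting a 1-D mean per cluster, so it is a structural analogy only, with hypothetical names and toy data:

```python
# Schematic EM-style alternation: E-like step assigns each demonstration
# to its nearest cluster; M-like step refits each cluster's parameter.

def cluster_demos(scores, centers, iters=10):
    """scores: 1-D features of demonstrations; centers: initial cluster parameters."""
    for _ in range(iters):
        # E-like step: hard-assign each demo to the closest cluster
        assign = [min(range(len(centers)), key=lambda k: abs(s - centers[k]))
                  for s in scores]
        # M-like step: refit each cluster's parameter from its members
        for k in range(len(centers)):
            members = [s for s, a in zip(scores, assign) if a == k]
            if members:
                centers[k] = sum(members) / len(members)
    return centers, assign

# four demos that plausibly come from two distinct behaviors
centers, assign = cluster_demos([0.1, 0.2, 0.9, 1.0], [0.0, 1.0])
```

In the paper's setting, the M-like step is the expensive part (an IRL solve per cluster), which is why the authors avoid solving it to completion at every iteration.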