The aim of the present investigation was to evaluate the effect of visual feedback on inexperienced listeners' ratings of voice quality severity level and on the reliability of their voice quality judgments. For this purpose, two training programs were created, each lasting 2 hours. In total, 37 undergraduate speech–language therapy students participated in the study and were divided into a visual plus auditory-perceptual feedback group (V + AF), an auditory-perceptual feedback group (AF), and a control group with no feedback (NF). All listeners completed two rating sessions in which they judged overall severity, labeled as grade (G), roughness (R), and breathiness (B). The judged voice samples were concatenations of continuous speech and sustained phonation. No significant changes in rater reliability between pretest and posttest were found between the three groups for any GRB parameter (all p > 0.05). A training effect was seen in the significant improvement of rater reliability for roughness within the NF and AF groups (all p < 0.05) and for breathiness within the V + AF group (p < 0.01). The rating of the severity level of roughness changed significantly after the training in the AF and V + AF groups (p < 0.01), and the rating of the severity level of breathiness changed significantly after the training in the V + AF group (p < 0.01). V + AF and AF training may only minimally influence the reliability of voice quality judgments, but it showed a significant influence on the rating of the severity level of GRB parameters. Therefore, the use of both visual and auditory anchors while rating, as well as longer training sessions, may be required to draw a firm conclusion.