JP2017108287

Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
DESCRIPTION JP2017108287
Abstract: To provide a communication device, a control method, and a control program that suitably
control the collection of sound by a microphone. A communication apparatus includes a
detection unit, a determination unit, and a directivity control unit. The detection unit 104 detects
the direction of the source of sound collected by a microphone array including a plurality of
microphones at the own site where the communication apparatus is installed. The determination
unit 105 determines whether an image area, which corresponds to the area of the display image to
be displayed on the display device of another site and includes the detected direction of the
source, includes a mask area whose display is to be masked on the display device of the other
site. When it is determined that the image area includes the mask area, the directivity control
unit 106 controls the directivity of the microphone array so as to limit the collection of sound
from the range of the own site corresponding to the mask area.
[Selected figure] Figure 3
Communication apparatus, control method and control program
[0001]
The present invention relates to a communication apparatus, a control method, and a control
program.
[0002]
[Description of the Related Art] Conventionally, communication systems for video conferencing
have been known that realize a conversation between users at each site using a network such
03-05-2019
1
as the Internet.
In a communication system for video conferencing, images and sounds collected by cameras and
microphones installed at each site are transmitted and received between communication devices
at each site connected to the network, and are output from the display devices and speakers at the
other sites, thereby realizing a video conference between sites.
[0003]
In a video conference, a camera capable of shooting in all directions and a microphone having
directivity may be used. By using a camera capable of shooting in all directions, it becomes
possible to shoot all users participating in the conference. In addition, it is possible to clearly
collect the user's voice by using the directional microphone. Furthermore, there is also a
technology for controlling the direction of the camera toward the sound source direction of the
sound detected by the microphone for the purpose of realizing a smoother conversation.
[0004]
Meanwhile, in a video conference, there may be users who do not want to appear in the video, or
things that users do not want shown. In such a case, a mask area, which is an area not displayed
in the video shown on the display device, is determined in advance, and video in which the mask
area has been masked in accordance with the pan, tilt, and zoom of the camera is displayed on the
display device.
[0005]
However, the prior art has the problem that it is difficult to suitably control the collection
of sound by the microphone. For example, when displaying an image in which a mask area has been
masked, there is a demand not to collect sounds generated in the mask area. The prior art
displays an image in which the mask area is masked but still collects sounds generated in the
mask area, so it cannot be said to satisfy this demand.
[0006]
The present invention has been made in view of the above, and an object thereof is to suitably
control the collection of sound by a microphone.
[0007]
In order to solve the problems described above and achieve the object, a communication device
according to the present invention includes: a detection unit that detects the direction of the
source of sound collected by a microphone array including a plurality of microphones at the own
site where the communication device is installed; a determination unit that determines whether
an image area, which corresponds to the area of a display image to be displayed on a display
device at another site and includes the detected direction of the source, includes a mask area
whose display is to be masked on the display device; and a directivity control unit that, when it
is determined that the mask area is included in the image area, controls the directivity of the
microphone array so as to limit the collection of sound from the range of the own site
corresponding to the mask area.
[0008]
According to one aspect of the present invention, it is possible to suitably control the collection
of sound by the microphone.
[0009]
FIG. 1 is a diagram showing an example of a system configuration of the communication system
according to the first embodiment.
FIG. 2 is a block diagram showing an example of the hardware configuration of the
communication apparatus according to the first embodiment.
FIG. 3 is a block diagram showing an example of a functional configuration of the communication
apparatus according to the first embodiment.
FIG. 4 is a diagram for explaining a setting example of the mask area according to the first
embodiment.
FIG. 5 is a diagram for explaining an example of a sound collection range by the
microphone array according to the first embodiment.
FIG. 6 is a diagram for explaining an example of a display image according to the first
embodiment.
FIG. 7 is a flow chart showing an example of the flow of control processing by the
communication apparatus according to the first embodiment.
FIG. 8 is a diagram for explaining an example of a display image according to a
modification of the first embodiment.
FIG. 9 is a flowchart showing an example of the flow of control processing by the
communication apparatus according to the first embodiment.
[0010]
Hereinafter, embodiments of a communication apparatus, a control method, and a control
program according to the present invention will be described with reference to the
accompanying drawings. Hereinafter, as an example of a communication system to which the
communication device according to the present invention is applied, a video conference system
which enables a conference between geographically distant sites is exemplified. However, the
present invention is widely applicable to various communication systems that transmit and
receive video and sound between a plurality of communication apparatuses, and to the various
communication apparatuses used in such communication systems, and is not limited by the
following embodiments.
[0011]
First Embodiment [System Configuration According to First Embodiment] The configuration of
the communication system according to the first embodiment will be described with reference to
FIG. 1. FIG. 1 is a diagram showing an example of a system configuration of the communication
system according to the first embodiment.
[0012]
As shown in FIG. 1, the communication system 1 includes communication devices 100 installed
at a plurality of bases and a relay device 200. The communication devices 100 installed at a
plurality of bases and the relay device 200 can be connected to a network 2 such as the Internet
or a local area network (LAN) and can communicate with each other. The number of
communication devices 100 included in the communication system 1 corresponds to the number
of bases participating in a video conference or the like. FIG. 1 takes as an example the case
where a communication apparatus 100 is installed at each of base A, base B, and base C.
[0013]
The communication device 100 transmits and receives various information to and from the
communication device 100 at another site via the relay device 200. The communication
apparatus 100 also controls the output of the received information. For example, the information
to be output is an image of each base taken by a camera, a sound of each base collected by a
microphone (mainly, a voice of a speaker), or the like. The communication device 100 may be a
dedicated terminal for a video conference, or may be a general-purpose terminal such as a PC
(Personal Computer), a smartphone, or a tablet terminal. The general-purpose terminal
implements each function of the communication apparatus 100 as one of the applications by
installing the control program according to the present embodiment. The relay device 200 is a
server device or the like that relays transmission of various information such as video and sound
between a plurality of communication devices 100 installed at each site.
[0014]
In the configuration described above, the communication apparatus 100 detects the direction of
the sound source of the sound collected by the microphone array including the plurality of
microphones at its own location. For example, in a video conference, the sound source is mainly
the speech by the speaker. The microphones constituting the microphone array have directivity,
and are distributed in the housing of the communication device 100. The communication device
100 performs control to switch between valid and invalid for each microphone, and integrates
the sounds collected by the microphones, thereby acquiring sounds in an arbitrary range at the
own site. Accordingly, in the present embodiment, a microphone array including a plurality of
directional microphones is used, and the direction of the speaker who is the sound source in a
video conference is detected by utilizing, for example, the time difference of the sound
reaching each microphone.
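
The use of arrival-time differences can be illustrated with a small sketch. The patent does not specify the estimation method, so everything here is an assumption made for illustration: a far-field (plane-wave) model, a speed of sound of 343 m/s, known microphone coordinates, and a simple search over whole-degree angles. The function names `predicted_delays` and `estimate_direction` are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def predicted_delays(mic_positions, angle_deg):
    """Relative arrival times of a plane wave arriving from angle_deg.

    mic_positions are (x, y) coordinates in metres around the device centre.
    A microphone lying further toward the source hears the sound earlier,
    so its delay is smaller.
    """
    ux = math.cos(math.radians(angle_deg))
    uy = math.sin(math.radians(angle_deg))
    return [-(x * ux + y * uy) / SPEED_OF_SOUND for x, y in mic_positions]


def _centered(values):
    """Remove the common offset so only time differences remain."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def estimate_direction(mic_positions, arrival_times, step_deg=1):
    """Return the candidate angle whose predicted time differences fit best."""
    observed = _centered(arrival_times)
    best_angle, best_error = 0, float("inf")
    for angle in range(0, 360, step_deg):
        predicted = _centered(predicted_delays(mic_positions, angle))
        error = sum((o - p) ** 2 for o, p in zip(observed, predicted))
        if error < best_error:
            best_angle, best_error = angle, error
    return best_angle
```

Only differences between arrival times matter in this sketch, so the absolute time at which the sound was emitted does not need to be known.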
[0015]
Then, the communication apparatus 100 determines whether an image area, which corresponds to the
area of the display image to be displayed on the display device of the other site and includes
the detected direction of the source, includes a mask area whose display is to be masked on the
display device of the other site. The display image displayed on the display device of the
other site corresponds to the image of the own site. As one mode, an omnidirectional camera
capable of photographing in all directions is used for photographing at the own site.
Further, in the present embodiment, by using a video including the direction of the source of the
sound detected based on the microphone array, the video of the speaker who can be the source
of the sound is transmitted to another site. For this reason, the image area corresponds to the
area of the image when cut out with the direction of the source of the detected sound as the
center based on the photographing of the own base by the omnidirectional camera. Further, the
communication apparatus 100 determines whether or not a mask area is included in the image
area based on mask data stored in advance. For example, the mask data is a coordinate range
with respect to the image of the own base taken by the omnidirectional camera. The user can set
in advance the coordinate range to be masked.
[0016]
Subsequently, when it is determined that the mask area is included in the image area, the
communication apparatus 100 controls the directivity of the microphone array so as to limit the
collection of sound from the range of the own site corresponding to the mask area. For example,
when the mask area is included in the image area, the communication apparatus 100 controls the
directivity of each microphone included in the microphone array so as to limit the collection of
sound from the range corresponding to the mask area, within the range of the own site
corresponding to the image area. That is, when the image displayed on the display device of the
other site includes the mask area, the communication device 100 controls the directivity of the
microphone array so as to restrict the collection of sound from the range of the own site
corresponding to the mask area and to collect sound mainly from the range of the own site
excluding the mask area. As a result, the sound collection by the microphone can be suitably
controlled in response to the demand not to collect the sound generated in the mask area.
[0017]
After that, the communication apparatus 100 executes mask processing based on the mask area and
generates a display image to be displayed on a display device at another site. It then transmits
output information, including the generated display image and the sound collected by the
microphone array whose directivity is controlled, to the communication device 100 installed at
another site.
[0018]
[Hardware Configuration of Communication Apparatus According to First Embodiment] Next, the
hardware configuration of the communication apparatus 100 according to the first embodiment
will be described with reference to FIG. 2.
FIG. 2 is a block diagram showing an example of the hardware configuration of the
communication apparatus 100 according to the first embodiment.
[0019]
As shown in FIG. 2, the communication apparatus 100 includes a central processing unit (CPU)
11, a read only memory (ROM) 12, a random access memory (RAM) 13, a solid state drive (SSD)
15, a media drive 17, an operation unit 18, and a power switch 19. In addition, the
communication device 100 includes a network I/F 21, a camera 22, an imaging element I/F
23, a microphone array 24, a speaker 25, an audio input/output I/F 26, a display I/F 27, and
an external device connection I/F 28.
[0020]
The CPU 11 controls the overall operation of the communication device 100 by executing a
program stored
in the ROM 12 or the like using the RAM 13 or the like as a work area. The ROM 12 stores a
program for realizing the processing by the communication device 100. The RAM 13 is a work
area at the time of execution of a program stored in the ROM 12 or the like. The SSD 15 controls
reading and writing of data to the flash memory 14 capable of storing various programs and
various data. The media drive 17 controls reading and writing (recording) of data with respect to
the recording medium 16.
[0021]
The operation unit 18 is operated to select the communication apparatus 100 at another site
with which the communication apparatus 100 communicates, to set the mask area, and to perform
various other settings. For example, the operation unit 18 may include a mouse, a keyboard, hard
keys, and the like, or may be a touch panel or the like capable of operation input. The power
switch 19 is used to switch the power of the communication device 100 on and off. The network
I/F 21 is an interface for controlling connection to the network 2 and transmission and
reception of various information. The camera 22 shoots the inside of the site. The imaging
element I/F 23 is an interface for controlling the driving of the camera 22 under the control of
the CPU 11. For example, the camera 22 may be an omnidirectional camera capable of imaging
all directions in the site, or may be a camera capable of switching the imaging direction to
follow the direction of the sound detected based on the microphone array. In the
present embodiment, a case where an omnidirectional camera is applied as the camera 22
will be described as an example. The switching of the photographing direction may be
either digital or analog.
[0022]
The microphone array 24 is configured of a plurality of microphones, collects sounds in the
site, and inputs the collected sounds to the communication device 100. The
speaker 25 outputs sound. The audio input/output I/F 26 is an interface that, under the control
of the CPU 11, controls the directivity of the microphone array 24, processes the input of
signals (mainly audio signals), and processes the output of signals, including control of the
volume of the speaker 25. The display I/F 27 is an interface for transmitting video data to be
displayed on the display device 50 under the control of the CPU 11. For example, the display
device 50 is a projector, a liquid crystal panel, or the like externally attached to the
communication device 100.
The external device connection I/F 28 is an interface for connecting various external devices to
the communication device 100. The communication apparatus 100 also has a bus 20 such as an
address bus or a data bus for electrically connecting the above-described units. The hardware
configuration shown in FIG. 2 is an example, and hardware other than the above may be added.
[0023]
[Functional Configuration of Communication Apparatus According to First Embodiment] Next, a
functional configuration of the communication apparatus 100 according to the first embodiment
will be described with reference to FIG. 3. FIG. 3 is a block diagram showing an example of a
functional configuration of the communication apparatus 100 according to the first embodiment.
[0024]
As shown in FIG. 3, the communication apparatus 100 includes an operation input reception unit
101, a display control unit 102, an audio output control unit 103, a detection unit 104, a
determination unit 105, a directivity control unit 106, a photographing control unit 107, an
image generation unit 108, an audio input control unit 109, and a transmission / reception
control unit 110. The respective units may be realized by software (program) or
may be realized by a hardware circuit. The above-described units are functions implemented by,
for example, the CPU 11 executing a control program developed on the RAM 13 from the flash
memory 14 or the like.
[0025]
The operation input reception unit 101 receives various operation inputs by the user who uses
the communication device 100. Specifically, the operation input reception unit 101 receives an
input of information on various settings and information for power control in accordance with a
user operation on the operation unit 18, the power switch 19, and the like. FIG. 4 is a diagram for
explaining a setting example of the mask area according to the first embodiment. The left side of
FIG. 4 shows a bird's-eye view of a conference room of a certain base, and the right side of FIG. 4
shows a setting screen.
[0026]
For example, as shown in FIG. 4, four users, user A, user B, user C, and user D, participate in
the conference in the conference room of a certain site. With the communication device 100
placed on a circular table as the center, user A is assumed to be in the direction of 0°, user B
in the direction of 90°, user C in the direction of 180°, and user D in the direction of 270°.
Here, a certain user operates the communication apparatus 100 to set a mask area. In response,
the setting screen shown on the right side of FIG. 4 is displayed on the display device 50. The
setting screen includes an omnidirectional video including the participants of the conference,
an "add mask" button, a "delete mask" button, an "OK" button, and a "cancel" button.
[0027]
In the setting screen, when the "add mask" button is pressed, a square formed by broken lines
appears. The range surrounded by the broken line corresponds to the mask area. The user
designates an area to be masked (the size of the square surrounded by the broken line) and presses
the "OK" button. Thus, the communication apparatus 100 sets the designated range as mask
data. On the setting screen, a plurality of mask areas can be set by pressing the "add
mask" button. To delete a mask area that has already been set, the user selects
the mask area to be deleted and then presses the "delete mask" button. The "cancel"
button is used to cancel out of the setting screen. In this way, the user can set the mask
area in advance. The range designated as a mask area is not restricted to a square and may have
an arbitrary shape.
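
The mask data produced by this screen can be pictured as a small collection of coordinate ranges. The following is a minimal sketch under stated assumptions: rectangular ranges in pixel coordinates of the omnidirectional image, and a "cancel" button that discards changes not yet confirmed with "OK" (the description only says that it cancels out of the screen). The names `MaskRect`, `MaskSettings`, and their methods are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MaskRect:
    """One mask area as a coordinate range (assumed: pixels, right/bottom exclusive)."""
    left: int
    top: int
    right: int
    bottom: int


@dataclass
class MaskSettings:
    """Mask data plus the edits pending on the setting screen."""
    committed: list = field(default_factory=list)  # mask data in effect
    pending: list = field(default_factory=list)    # edits shown on the screen

    def open_screen(self):
        # Start editing from the mask data currently in effect.
        self.pending = list(self.committed)

    def add_mask(self, rect):
        # "add mask" button: several mask areas may be added.
        self.pending.append(rect)

    def delete_mask(self, rect):
        # "delete mask" button: remove a selected, already-set mask area.
        self.pending.remove(rect)

    def ok(self):
        # "OK" button: the designated ranges become the mask data.
        self.committed = list(self.pending)

    def cancel(self):
        # "cancel" button (assumed behaviour): drop unconfirmed edits.
        self.pending = list(self.committed)
```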
[0028]
The display control unit 102 controls display processing on the display device 50. For example,
the display control unit 102 executes drawing processing and the like on the video at the other
site received from the communication device 100 at the other site, and outputs the processed
data to the display device 50. Thereby, the display device 50 displays and outputs an image
(display image) including the video at another site.
[0029]
The audio output control unit 103 controls output processing of sound to the speaker 25. For
example, the audio output control unit 103 decodes sound data at another site received from the
communication apparatus 100 at another site, and outputs the decoded data (mainly, audio data)
to the speaker 25. Thus, the speaker 25 reproduces and outputs audio data at another site.
[0030]
The detection unit 104 detects the direction of the source of the sound collected by the
microphone array 24 including the plurality of microphones at the own site. More
specifically, based on the collection of sounds by the microphone array 24 including a plurality
of directional microphones, the detection unit 104 detects the direction of the speaker who is
the source of the sound at the own site by using, for example, the time difference of the sound
reaching each microphone. The direction of the speaker who is the source of the sound is a
direction relative to the communication device 100.
[0031]
The determination unit 105 determines whether an image area, which corresponds to the area of the
display image to be displayed on the display device 50 at the other site and includes the
detected direction of the source, includes a mask area whose display is to be masked on the
display device 50 at the other site. More specifically, based on the preset coordinate range of
the mask data, the determination unit 105 determines whether the mask area is included in the
image area that includes the direction of the speaker who is the source of the sound detected by
the detection unit 104. For example, the image area corresponds to the area of the image cut
out, with the detected direction of the source as the center, from the image of the own site
photographed by the omnidirectional camera.
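
This determination can be sketched as an overlap test between angular ranges. The sketch assumes that the horizontal coordinate of the omnidirectional image maps to an angle from 0 to 360 degrees, that each mask is given as a `(start, end)` angle pair, and that the image area spans a fixed angle of view centered on the source; the 90 degree default and all function names are hypothetical.

```python
def split_range(start, end):
    """Split an angular range into pieces that do not cross 0 degrees."""
    start, end = start % 360, end % 360
    if start < end:
        return [(start, end)]
    # The range crosses 0 degrees, so it is handled as two pieces.
    return [(start, 360), (0, end)]


def image_area(source_deg, view_angle_deg=90):
    """Angular range cut out around the detected source direction."""
    half = view_angle_deg / 2
    return (source_deg - half, source_deg + half)


def includes_mask(image, masks):
    """True if any mask range overlaps the image range."""
    for mask in masks:
        for a_lo, a_hi in split_range(*image):
            for b_lo, b_hi in split_range(*mask):
                if max(a_lo, b_lo) < min(a_hi, b_hi):
                    return True
    return False
```

For a speaker at 90 degrees, `includes_mask(image_area(90), [(120, 150)])` reports an overlap because the image area spans 45 to 135 degrees under these assumptions.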
[0032]
When it is determined that the mask area is included in the image area, the directivity control
unit 106 controls the directivity of the microphone array 24 so as to limit the collection of
sound from the range of the own site corresponding to the mask area. More specifically, when the
determination unit 105 determines that the mask area is included in the image area, the
directivity control unit 106 controls the directivity of each microphone of the microphone array
24 so as to limit the collection of sound from the range of the own site corresponding to the
mask area, within the range of the own site corresponding to the image area.
[0033]
FIG. 5 is a diagram for explaining an example of a sound collection range by the microphone
array 24 according to the first embodiment. For example, as shown in the upper part of FIG. 5, it
is assumed that a shaded mask area is included in a part of the image area surrounded by a thick
line including the direction of the sound source represented by the arrow. At this time, as shown
in the lower part of FIG. 5, the directivity control unit 106 controls the directivity of the
microphone array 24 so as to restrict the collection of sound from the range of the own site
corresponding to the mask area, within the range of the own site corresponding to the image
area, and to collect the sound from the range surrounded by the bold line.
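
One simple way to picture this control is as a per-microphone valid/invalid decision. The sketch below is only an illustration under assumptions: each directional microphone is reduced to a single facing angle, a microphone is enabled when it faces into the image area but not into a mask area, and range boundaries are inclusive. An actual device may shape directivity differently; the names are hypothetical.

```python
def in_range(angle, start, end):
    """True if angle lies on the arc from start to end (degrees, inclusive)."""
    angle, start, end = angle % 360, start % 360, end % 360
    if start <= end:
        return start <= angle <= end
    return angle >= start or angle <= end  # the arc crosses 0 degrees


def select_microphones(mic_directions, image, masks):
    """Return one enabled flag per microphone.

    mic_directions: facing angle of each microphone in degrees (assumed).
    image: (start, end) angular range corresponding to the image area.
    masks: list of (start, end) angular ranges corresponding to mask areas.
    """
    flags = []
    for direction in mic_directions:
        faces_image = in_range(direction, *image)
        faces_mask = any(in_range(direction, *m) for m in masks)
        flags.append(faces_image and not faces_mask)
    return flags
```

With eight microphones facing every 45 degrees, an image area of 45 to 135 degrees, and a mask of 120 to 150 degrees, only the microphones facing 45 and 90 degrees stay enabled in this sketch.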
[0034]
The photographing control unit 107 controls photographing by the camera 22. For example, the
photographing control unit 107 controls start and end of photographing by the camera 22 which
is an omnidirectional camera, magnification, and the like. Further, the photographing control unit
107 outputs the image photographed by the camera 22 to the image generation unit 108.
[0035]
The image generation unit 108 executes mask processing based on the mask area, and generates
a display image to be displayed on the display device 50 at another site. More specifically, the
image generation unit 108 cuts out an image centered on the direction of the sound generation
source from the image output by the photographing control unit 107. The cut out image
corresponds to an image area. Then, based on the coordinate range of the mask data
corresponding to the mask area, the image generation unit 108 performs mask processing on the
cut out image, and generates a display image to be displayed on the display device 50 at the
other site. That is, the display image is an image cut out from the omnidirectional image in
accordance with a predetermined angle of view, with the direction of the sound generation
source as the center. In the mask processing, arbitrary masking may be applied such as filling in
a coordinate range to be masked with a single color or making it in a mosaic form. Thereafter,
the image generation unit 108 outputs the generated display image to the transmission /
reception control unit 110.
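
The cut-out and single-color fill can be sketched on a frame held as nested lists of pixel values. This is an illustration under assumptions: rectangles are `(left, top, right, bottom)` tuples in frame coordinates with exclusive right and bottom edges, the cut-out does not wrap around the seam of the omnidirectional image, and `crop_and_mask` is a hypothetical name.

```python
def crop_and_mask(frame, crop, masks, fill=0):
    """Cut out `crop` from `frame` and fill the mask ranges with one colour.

    frame: list of rows, each a list of pixel values.
    crop:  (left, top, right, bottom) of the image area in frame coordinates.
    masks: list of (left, top, right, bottom) mask ranges in frame coordinates.
    """
    left, top, right, bottom = crop
    # Slicing copies each row, so the original frame is left untouched.
    out = [row[left:right] for row in frame[top:bottom]]
    for m_left, m_top, m_right, m_bottom in masks:
        # Clip the mask to the cut-out and shift it to cut-out coordinates.
        x0, x1 = max(m_left, left) - left, min(m_right, right) - left
        y0, y1 = max(m_top, top) - top, min(m_bottom, bottom) - top
        for y in range(y0, y1):
            for x in range(x0, x1):
                out[y][x] = fill
    return out
```

A mosaic could be substituted for the fill by replacing the innermost assignment with a block average; the single-color fill is used here only because it is the shortest to show.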
[0036]
The audio input control unit 109 controls input processing of sound from the microphone array
24. For example, the audio input control unit 109 encodes the sound in the site
collected by the microphone array 24 in an arbitrary encoding format such as PCM (Pulse Code
Modulation), and outputs the encoded data (mainly voice data) to the transmission / reception
control unit 110.
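
As an illustration of PCM encoding, the sketch below quantizes samples to 16-bit signed integers. The description names PCM only as one possible format, so the bit depth, the little-endian byte order, and the input range of -1.0 to 1.0 are assumptions, and `encode_pcm16` is a hypothetical name.

```python
import struct


def encode_pcm16(samples):
    """Encode float samples in [-1.0, 1.0] as 16-bit little-endian PCM bytes."""
    # Clip out-of-range samples so they cannot overflow the 16-bit range.
    clipped = [max(-1.0, min(1.0, s)) for s in samples]
    quantized = [int(round(s * 32767)) for s in clipped]
    # "<" selects little-endian, "h" is a signed 16-bit integer.
    return struct.pack("<%dh" % len(quantized), *quantized)
```

Each sample becomes two bytes, so one second of audio at an assumed 16 kHz sampling rate would occupy 32,000 bytes before any further compression.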
[0037]
The transmission / reception control unit 110 controls transmission / reception of various
information to and from the communication apparatus 100 at another site through the relay
apparatus 200 over the network 2. The transmission / reception control unit 110 corresponds to a
"transmission control unit" as one aspect. For example, the transmission / reception control
unit 110 transmits output information, including the display image output by the image generation
unit 108 and the audio data output by the audio input control unit 109, to the communication
apparatus 100 at another site via the relay device 200. The transmission / reception
control unit 110 also receives output information from the communication device 100 installed
at another site. The transmission / reception control unit 110 outputs data relating to a display
image included in the received output information to the display control unit 102, and outputs
data relating to audio to the audio output control unit 103. As a result, under the control of the
display control unit 102 and the audio output control unit 103, video and audio of another site
are output.
[0038]
FIG. 6 is a diagram for explaining an example of a display image according to the first
embodiment. The upper part of FIG. 6 is an example of an image in which the direction of the
sound source is close up, and the lower part of FIG. 6 is an example of an omnidirectional image
including an image area and a mask area. The display device 50 may display and output the
close-up video as a display image, or may display and output the close-up video and the
omnidirectional video as a display image. For example, as shown in FIG. 6, when a mask area is
included in a part of the image area including the user B (speaker) who is the sound source in
the omnidirectional image, the range represented by the horizontal lines is the sound collection
range. Further, the close-up video includes the user B, a mask area filled with a single color,
and the like. Because collection of sound from the range corresponding to the mask
area is limited, the speaker 25 mainly outputs the sound collected from the range represented by
the horizontal lines.
[0039]
[Control Processing Flow According to First Embodiment] Next, the flow of control processing by
the communication apparatus 100 according to the first embodiment will be described with
reference to FIG. 7. FIG. 7 is a flow chart showing an example of the flow of control processing by
the communication apparatus 100 according to the first embodiment.
[0040]
As shown in FIG. 7, based on the collection of sounds by the microphone array 24 including a
plurality of microphones, the communication device 100 detects the direction of the sound source
at the own site by using, for example, the time difference of the sound reaching each microphone
(step S101). Then, the communication apparatus 100 determines
whether or not the direction of the detected sound source is included in the mask area, based on
the preset coordinate range of the mask data (step S102). When the direction of the
sound source is included in the mask area (step S102: Yes), collecting the sound from the
detected direction is not desirable, so the communication device 100 terminates the process
without executing directivity control and the like. On the other hand, when the direction
of the sound source is not included in the mask area (step S102: No), the
communication apparatus 100 determines, based on the coordinate range of the mask data, whether
a mask area is included in the image area including the direction of the detected sound source
(step S103).
[0041]
When the mask area is included in the image area (step S103: Yes), the communication
apparatus 100 controls the directivity of each microphone of the microphone array 24 so as to
restrict the collection of sound from the range of the own site corresponding to the mask area,
within the range of the own site corresponding to the image area (step S104). When the
mask area is not included in the image area (step S103: No), the communication apparatus 100
controls the directivity of each microphone of the microphone array 24 so as to collect the
sound from the source without limiting the collection of sound from the range of the own site
corresponding to the image area (step S105).
[0042]
Then, the communication apparatus 100 cuts out an image centered on the direction of the sound
source from the captured image, performs mask processing on the cut out image as necessary
based on the coordinate range of the mask data corresponding to the mask area, and generates a
display image to be displayed on the display device 50 at
another site (step S106). Subsequently, the communication device 100 transmits output
information including the display image and the audio data to the communication device 100
installed at another site via the relay device 200 (step S107).
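
The branching of steps S102 to S105 can be summarized in a short sketch. It assumes angular `(start, end)` ranges for the mask data, arcs shorter than 360 degrees, inclusive boundaries, and a 90 degree angle of view; the function `control_step`, its return values, and the directivity labels are hypothetical and only stand in for the actual control.

```python
def in_range(angle, start, end):
    """True if angle lies on the arc from start to end (degrees, inclusive)."""
    angle, start, end = angle % 360, start % 360, end % 360
    if start <= end:
        return start <= angle <= end
    return angle >= start or angle <= end  # the arc crosses 0 degrees


def overlaps(a, b):
    """Two arcs overlap when the start of one lies on the other."""
    return in_range(a[0], *b) or in_range(b[0], *a)


def control_step(source_deg, masks, view_angle_deg=90):
    """Decide the handling for a source direction already detected in S101."""
    # S102: a source inside a mask area ends the process without
    # directivity control or transmission.
    if any(in_range(source_deg, *m) for m in masks):
        return None
    half = view_angle_deg / 2
    image = (source_deg - half, source_deg + half)
    # S103: is any mask area included in the image area?
    hit = [m for m in masks if overlaps(image, m)]
    # S104 limits collection from the mask range; S105 applies no limit.
    directivity = "limit_mask_range" if hit else "whole_image_range"
    # S106 and S107 would then mask the listed ranges and transmit.
    return {"image_area": image, "directivity": directivity, "masks_to_draw": hit}
```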
[0043]
[Effects of First Embodiment] As described above, the communication device 100 detects
the direction of the sound source and, when a mask area is included in the image area
including the detected direction of the source, controls the directivity of
the microphone array 24 so as to limit the collection of sound from the range of
the own site corresponding to the mask area. The collection of sound by the microphone can
therefore be suitably controlled. In other words, since the communication apparatus 100 mainly
collects sound from the range of the own site excluding the mask area, sound collection by the
microphone can be suitably controlled in response to the demand not to collect sound from the
range of the own site corresponding to the mask area.
[0044]
(Modification of First Embodiment) The first embodiment described the case where the image
area includes one mask area and one area that is not masked (hereinafter referred to as a
"non-mask area"). In the first embodiment, the mask area corresponds to the range of the own site
from which the collection of sound is restricted, and the non-mask area corresponds to the range
of the own site from which sound is mainly collected. However, depending on the
setting of the mask area, the image area may include a plurality of mask areas and non-mask
areas. Therefore, in the modification of the first embodiment, the sound collection range in the case
where a plurality of non-mask areas are included in an image area will be described. The
hardware configuration of the communication apparatus 100 according to the modification of
the first embodiment is the same as the hardware configuration of the communication apparatus
100 according to the first embodiment. In the following, functions different from the
communication apparatus 100 according to the first embodiment will be described using FIG. 3
and the like.
[0045]
The determination unit 105 further determines whether the image area is divided into a plurality
of non-mask areas by the mask area. More specifically, based on the coordinate range of the
mask data set in advance, the determination unit 105 determines whether the image area
including the direction of the speaker, who is the generation source of the sound detected by
the detection unit 104, is divided into a plurality of non-mask areas by the mask area.
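One possible way to make this determination is sketched below. It assumes that the image area
and each mask area can be reduced to one-dimensional horizontal coordinate ranges (start
inclusive, end exclusive) and ignores wrap-around at the panorama edge. The function names
non_mask_areas and is_divided are hypothetical.

```python
def non_mask_areas(image_area, mask_areas):
    """Return the parts of image_area not covered by any mask area,
    as a sorted list of (start, end) ranges."""
    start, end = image_area
    areas = []
    cursor = start
    for m_start, m_end in sorted(mask_areas):
        # Clip the mask to the image area.
        m_start, m_end = max(m_start, start), min(m_end, end)
        if m_start >= m_end:
            continue  # this mask lies entirely outside the image area
        if m_start > cursor:
            areas.append((cursor, m_start))
        cursor = max(cursor, m_end)
    if cursor < end:
        areas.append((cursor, end))
    return areas


def is_divided(image_area, mask_areas):
    """True when the mask areas split the image area into two or more
    non-mask areas."""
    return len(non_mask_areas(image_area, mask_areas)) > 1
```

For example, a mask in the middle of the image area divides it, whereas a mask that only
overlaps one edge of the image area leaves a single non-mask area.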
[0046]
When it is determined that the image area is divided into a plurality of non-mask areas by the
mask area, the directivity control unit 106 controls the directivity of the microphone array 24
so as to limit the collection of sound from the range of the own site corresponding to the area
excluding the non-mask area that includes the generation source. More specifically, when the
determination unit 105 determines that the image area is divided into a plurality of non-mask
areas by the mask area, the directivity control unit 106 controls the directivity of each
microphone of the microphone array 24 so that, within the range of the own site corresponding
to the image area, the collection of sound is limited from the range corresponding to the mask
area and from the range corresponding to any non-mask area that does not include the direction
of the sound generation source.
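The description does not state how an area of the image is converted into a range of the own
site. The following sketch assumes a 360-degree panorama in which azimuth grows linearly with
the horizontal pixel coordinate, and returns the center and width (in degrees) of the pickup
range for the non-mask area that contains the sound generation source. The function names and
the linear mapping are assumptions, and the actual beamforming of the microphone array 24 is
not shown.

```python
def pixel_range_to_beam(x_start, x_end, image_width):
    """Map a horizontal pixel range to (center_deg, width_deg), assuming
    azimuth = x * 360 / image_width."""
    width_deg = (x_end - x_start) * 360.0 / image_width
    center_deg = ((x_start + x_end) / 2.0 * 360.0 / image_width) % 360.0
    return center_deg, width_deg


def pickup_beam(source_x, non_mask_areas, image_width):
    """Return the beam for the non-mask area containing source_x, or None
    when the source lies in no non-mask area (for example, inside a mask)."""
    for x_start, x_end in non_mask_areas:
        if x_start <= source_x < x_end:
            return pixel_range_to_beam(x_start, x_end, image_width)
    return None


# Usage: two non-mask areas in a 3600-pixel-wide panorama, speaker at x = 2400.
beam = pickup_beam(2400, [(1000, 1500), (2000, 3000)], image_width=3600)
```

Here beam describes only the second non-mask area, so sound from the mask area and from the
first non-mask area would fall outside the pickup range.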
[0047]
FIG. 8 is a diagram for explaining an example of a display image according to the modification
of the first embodiment. The upper part of FIG. 8 is an example of an image showing a close-up
of the direction of the sound generation source, and the lower part of FIG. 8 is an example of
an omnidirectional image including an image area and a mask area. For example, as shown in
FIG. 8, in the omnidirectional image, the image area including user B (the speaker), who is the
sound generation source, is divided into a plurality of non-mask areas by a mask area existing
in part of the image area. The range indicated by horizontal lines is the range from which
sound is collected. The collection of sound from the ranges of the own site corresponding to
the mask area and to the non-mask area that does not include user B is limited, and the speaker
25 outputs sound data collected mainly from the range of the own site indicated by the
horizontal lines.
[0048]
[Control Processing Flow According to Modification of First Embodiment] Next, the flow of
control processing by the communication apparatus 100 according to the modification of the
first embodiment will be described with reference to FIG. 9. FIG. 9 is a flowchart illustrating
an example of the flow of control processing by the communication apparatus 100 according to
the modification of the first embodiment. Descriptions of steps that are the same as in the
control processing flow of the first embodiment may be omitted. Specifically, steps S201 to
S203 are the same as steps S101 to S103, and steps S207 to S209 are the same as steps S105 to
S107.
[0049]
As shown in FIG. 9, when the image area includes the mask area (step S203: Yes), the
communication apparatus 100 determines whether the image area is divided into a plurality of
non-mask areas by the mask area (step S204). When the image area is not divided into a
plurality of non-mask areas (step S204: No), the communication apparatus 100 controls the
directivity of each microphone of the microphone array 24 so as to limit the collection of
sound from the range of the own site corresponding to the mask area, within the range of the
own site corresponding to the image area (step S205). On the other hand, when the image area is
divided into a plurality of non-mask areas (step S204: Yes), the communication apparatus 100
controls the directivity of the microphone array 24 so as to limit the collection of sound from
the range of the own site corresponding to the area excluding the non-mask area that includes
the sound generation source, within the range of the own site corresponding to the image area
(step S206).
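The branch from step S203 to step S206 can be summarized in a short sketch. It assumes
one-dimensional horizontal coordinate ranges (start inclusive, end exclusive) for the image
area and the mask areas, and it returns the step that sets the directivity together with the
ranges from which sound collection is limited. Treating "step S203: No" as leading to step S207
is inferred from the statement that step S207 matches step S105. The function name
restricted_ranges is hypothetical.

```python
def restricted_ranges(image_area, mask_areas, source_x):
    """Return (step, ranges): the directivity-setting step and the parts of
    image_area from which sound collection is limited."""
    start, end = image_area
    # Clip each mask to the image area and drop masks that fall outside it.
    masks = sorted((max(s, start), min(e, end)) for s, e in mask_areas)
    masks = [(s, e) for s, e in masks if s < e]
    if not masks:
        return "S207", []          # S203: No, nothing is limited
    # Non-mask areas are the gaps left between the clipped masks.
    gaps, cursor = [], start
    for s, e in masks:
        if s > cursor:
            gaps.append((cursor, s))
        cursor = max(cursor, e)
    if cursor < end:
        gaps.append((cursor, end))
    if len(gaps) <= 1:
        return "S205", masks       # S204: No, limit the mask area only
    # S204: Yes, limit everything except the non-mask area holding the source.
    others = [g for g in gaps if not (g[0] <= source_x < g[1])]
    return "S206", sorted(masks + others)
```

With an image area of (100, 500), a mask at (250, 300), and the speaker at x = 400, the sketch
selects step S206 and limits both the mask area and the non-mask area (100, 250).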
[0050]
[Effects of Modification of First Embodiment] As described above, when the image area is divided
into a plurality of non-mask areas by the mask area, the communication apparatus 100 controls
the directivity of the microphone array 24 so as to limit the collection of sound from the
range of the own site corresponding to the area excluding the non-mask area that includes the
sound generation source. The collection of sound by the microphones can therefore be controlled
more suitably. In other words, because the communication apparatus 100 mainly collects sound
from the range of the own site corresponding to the non-mask area that includes the speaker who
is the sound generation source, the collection of sound by the microphones can be controlled
more suitably.
[0051]
(Other Embodiments) Although embodiments of the communication apparatus 100 according to the
present invention have been described, the present invention may be implemented in various
modes other than the above-described embodiments. Different embodiments are therefore described
below with respect to (1) configuration and (2) program.
[0052]
(1) Configuration The processing procedures, control procedures, specific names, and information
including various data and parameters shown in the above description and drawings can be
changed arbitrarily unless otherwise specified. Further, each component of the illustrated
apparatus is functionally conceptual and does not necessarily have to be physically configured
as illustrated. That is, the specific form of distribution or integration of the devices is not
limited to that shown in the drawings, and all or part of them may be functionally or
physically distributed or integrated in arbitrary units according to various loads, usage
conditions, and the like.
[0053]
In the above embodiment, the camera 22 is an omnidirectional camera. However, the camera 22
may not be an omnidirectional camera but may be a camera whose imaging direction can be
switched. When the omnidirectional camera is not adopted, the photographing control unit 107
controls the direction of photographing by the camera 22 centering on the direction of the sound
generation source. At this time, the image area corresponds to the area of the image when
photographed at the angle of view when the direction of the camera 22 is controlled centering on
the direction of the sound source. Further, when the display image is generated by the image
generation unit 108, the image is not cut out because it is not an omnidirectional image.
[0054]
Further, the above embodiment illustrated the case where the image area is divided into two
non-mask areas by one mask area. The numbers and ranges of the mask areas and non-mask areas
are examples and are not limited to those shown in the figures. That is, the non-mask area may
be divided in various ways other than the above, and the image area may include a plurality of
mask areas.
[0055]
(2) Program As one mode, the control program executed by the communication apparatus 100 is
recorded and provided, as a file in an installable or executable format, on a computer-readable
recording medium such as a CD-ROM, a flexible disk (FD), a CD-R, or a DVD (Digital Versatile
Disk). The control program executed by the communication apparatus 100 may also be stored on a
computer connected to a network such as the Internet and provided by being downloaded via the
network. The control program executed by the communication apparatus 100 may also be provided
or distributed via a network such as the Internet. Further, the control program executed by the
communication apparatus 100 may be provided by being incorporated in advance in a ROM or the
like.
[0056]
The control program executed by the communication apparatus 100 has a module configuration
including the above-described units (at least the detection unit 104, the determination unit
105, and the directivity control unit 106). As actual hardware, the CPU reads the control
program from the storage medium and executes it, whereby the above-described units are loaded
onto the main storage device, and the detection unit 104, the determination unit 105, the
directivity control unit 106, and the like are generated on the main storage device.
[0057]
100 communication apparatus
101 operation input acceptance unit
102 display control unit
103 voice output control unit
104 detection unit
105 determination unit
106 directivity control unit
107 imaging control unit
108 image generation unit
109 voice input control unit
110 transmission / reception control unit
200 relay device
[0058]
Patent No. 3722625 gazette Patent No. 5776313 gazette JP, 18-182979, A